Abstract

We examine the problem of jet reconstruction at heavy-ion colliders using jet-area-based 
background subtraction tools as provided by FastJet. We use Monte Carlo simulations with 
and without quenching to study the performance of several jet algorithms, including the option 
of filtering, under conditions corresponding to RHIC and LHC collisions. We find that most 
standard algorithms perform well, though the anti-$k_t$ and filtered Cambridge/Aachen algorithms
have clear advantages in terms of the reconstructed $p_t$ offset and dispersion.
1 Introduction



Since the appearance of the first high-energy colliders in the 1980's, the study of "jets" of particles 
produced in the final state has proved to be a powerful tool for probing the underlying elementary 
dynamics of the strong force as described by Quantum Chromodynamics (QCD). Jets have been 
extensively studied at $e^+e^-$ (LEP, SLC), $ep$ (HERA) and $pp$/$p\bar{p}$ (Tevatron, RHIC, LHC) colliders,
with a wide variety of jet algorithms, including infrared and collinear (IRC) safe algorithms such 
as those of refs. [1, 2, 3, 4].

A topic of current interest is the use of jets in heavy-ion (HI) collisions, where, for example, 
they can be used to probe the hot, dense medium. In the past few years, the Relativistic Heavy 
Ion Collider (RHIC) has amassed significant numbers of copper-copper and gold-gold collisions at 
nucleon-nucleon centre-of-mass energies up to $\sqrt{s_{NN}} = 200$ GeV, and the Large Hadron Collider
(LHC) should deliver high yields of lead-lead collisions at much higher energies in the near future. 

The main obstacle to studying jets in HI collisions is the presence of the huge background given 
by the underlying event (UE) produced simultaneously with the hard nucleon-nucleon collision that 
initiates the high-transverse momentum jet of interest. This UE needs to be properly subtracted 
from the momentum of a given jet in order to reconstruct its "true" momentum, i.e. the one it 
would have in the absence of the UE contribution. This problem is of course well known, and is 
also present, though to a much smaller extent, in jet studies in proton-proton collisions. Various 
approaches to address it have been proposed (see e.g. [5, 6, 7, 8, 9, 10, 11, 12, 13]).

In ref. [14] two of us proposed a jet area-related technique to determine the transverse-momentum
density of a sufficiently uniformly distributed background and to subtract it from the jet
momenta. The method of [14] introduced several novel steps, such as the measurement of jet areas,
and procedures to determine the transverse-momentum density of the underlying event and/or
pileup. Subsequent work of ours [15, 16] has sought to provide firmer foundations for these concepts
and methods, as well as practical tests in pp jet reconstruction tasks with simulated events [17].
Preliminary experimental jet measurements from the STAR collaboration at RHIC, whose analysis
is partially based on the ideas of [14], have been presented in ref. [18].

In this article we give a systematic examination of the performance of such methods for heavy- 
ion collisions, applying them to Monte Carlo simulations for RHIC and the LHC. In particular, we 
determine the accuracy with which jet momenta can be effectively reconstructed, comparing the 
performance of several different jet algorithms. 

2 Challenges of jet reconstruction in heavy-ion collisions 

Jets in HI collisions are produced in an environment that is far from conducive to their detection 
and accurate measurement. Monte Carlo simulations (and real RHIC data) for gold-gold collisions 
at $\sqrt{s_{NN}} = 200$ GeV (per nucleon-nucleon collision) show that the transverse momentum density $\rho$
of final-state particles is about 100 GeV per unit area (in the rapidity-azimuth plane). For lead-lead
collisions at $\sqrt{s_{NN}} = 5.5$ TeV at the LHC this figure is expected to increase by a factor of $\sim 2$-$3$.
This means that jets returned by jet definitions with a radius parameter of, e.g., $R = 0.4$, will
contain background contamination of the order of $\pi R^2 \rho \simeq 50$ GeV and $100$-$150$ GeV respectively.
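The size of this contamination follows from simple arithmetic. The short sketch below only evaluates $\pi R^2 \rho$ for the densities quoted above; the function name is illustrative, and the LHC densities are taken as 2 and 3 times the RHIC figure, which is the rough factor quoted in the text rather than a measured value.

```python
import math

def background_contamination(rho, radius):
    """Expected background p_t (GeV) inside a jet of area pi * radius^2."""
    return math.pi * radius**2 * rho

R = 0.4
RHO_RHIC = 100.0  # GeV per unit area, the RHIC figure quoted in the text

# Assumption: LHC densities taken as 2x and 3x the RHIC value.
for label, rho in [("RHIC", RHO_RHIC),
                   ("LHC (x2)", 2 * RHO_RHIC),
                   ("LHC (x3)", 3 * RHO_RHIC)]:
    print(f"{label}: {background_contamination(rho, R):.1f} GeV")
```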



A related, and perhaps more challenging, obstacle to accurate jet reconstruction¹ is due to the
fluctuations both of the background level (from event to event, but also from point to point in a
single event) and of the jet area itself: knowing the contamination level 'on average' is not sufficient
to accurately reconstruct each individual jet.

1 Note that we are considering the reconstruction of the jets exclusively at the particle level. Detector effects can
of course be relevant, and need to be considered in detail, but are beyond the scope of our analysis, which is only
concerned with the removal of the background from the raw jet momenta. Note also that we shall be using the terms

A wide range of strategies have already been advocated to reconstruct jets in HI collisions. 
Just to mention them briefly, approaches to HI jet reconstruction may involve one or more of the 
following measures: choosing a jet algorithm returning jets with an area as constant as possible; 
eliminating from the clustering all particles below a given transverse momentum threshold, say 
$p_{t,\min} \sim 2$ GeV; measuring the background level in a region of the detector thought to be not
affected by the hard event; or parametrising the average background level (and fluctuations) in 
terms of some other measured properties of the event, such as its centrality. 

Our framework has the following characteristics: 

• We shall restrict ourselves to IRC safe jet algorithms, to ensure that measurements can be 
meaningfully compared to higher-order perturbative QCD calculations. 

• The jet algorithms we use will not be limited to those yielding jets of regular shape. 

• Our analysis will avoid excluding small transverse momentum particles from the clustering. 
Doing so is collinear-unsafe and inevitably biases the reconstructed jet momenta, which must 
then be corrected using Monte Carlo simulations, which can have substantial uncertainties in 
their modelling of jet quenching and energy loss. Instead, we shall try to achieve a bias-free 
reconstructed jet, working with all the particles in the event.²

• The background level will be determined through adaptations of the jet-area/median procedure
suggested in [14] and analysed in detail in [16]. This procedure is designed to give an
estimate of the background that is minimally affected by the presence of hard jets.

3 Simulation and analysis framework 
3.1 Hard and full events 

In order to test the effectiveness of subtracting the underlying event in a heavy-ion collision, and 
determine the quality of the jet reconstruction, we need access to the idealised hard jets, without 
the background, as a reference. This can be done by considering a simulated hard pp event first in 
isolation, and then embedded in a heavy-ion event. For clarity in the discussions, we shall adopt 
the following terminology: 

• the hard event refers to the hard pp event alone, without a heavy-ion background; 

• the full event refers to the combination of the hard event and the HI background (which possibly
includes many semi-hard events of its own);

• hard jets and full jets refer to jets from the hard and full events, respectively.

'(background-)subtracted momentum' and 'reconstructed momentum' equivalently. 

2 Our framework can, of course, also accommodate the elimination of particles with low transverse momenta, and 
this might help reduce the dispersion of the reconstructed momentum, albeit at the expense of biasing it. Whether one 
prefers to reduce the dispersion or instead the bias depends on the specific physics analysis that one is undertaking. 
Note also that detectors may effectively introduce low-pt cutoffs of their own. These detector artefacts should not 
have too large an impact on collinear safety as long as they appear at momenta of the order of the hadronisation 
scale of QCD, i.e. a few hundred MeV. 
Note that although making the distinction between a hard and a full event is not possible with real
data, it is feasible in certain Monte Carlo studies. One should nevertheless be aware that the extent 
to which such a distinction is physically meaningful is a question that remains open in view of the 
numerous issues related to the interaction between an energetic parton and the medium through 
which it travels. 

If the Monte Carlo program used to simulate heavy-ion events explicitly provides a separation 
between hard events and a soft part (e.g. as does HYDJET [19, 20]), one can extract one of the
former as the single hard event and take the complete event as the full one. Alternatively, one can 
generate a hard event independently and embed it in a heavy-ion event (obtained from a Monte 
Carlo or from real collisions) to obtain the full event. This second approach, which we have adopted 
for the bulk of results presented here because of its greater computational efficiency, is sensible as 
long as the embedded hard event is much harder than any of the semi-hard events that tend to be 
present in the background. For the transverse momenta that we consider, this condition is typically 
fulfilled. Note however that for studies in which the presence of a hard collision is not guaranteed, 
e.g. the evaluation of fake-jet rates, one should use the first approach. 

3.2 Matching and quality measures 

To measure the performance of our heavy-ion background subtraction procedure we apply it to both 
hard and full events. We then take the two hardest resulting jets in each hard event and compare 
them to the corresponding subtracted jets in the full event. 

To do this in practice, we need a method to match the jets obtained from the full event to those 
from the hard event alone, to make sure that we are comparing the same object. In other analyses, 
this matching was typically performed in terms of the position of the jet in the rapidity-azimuth 
plane, requiring that the two jet axes not differ by more than a given $\Delta R = \sqrt{\Delta y^2 + \Delta\phi^2}$. The
exact value of $\Delta R$ is an arbitrary choice, and values between 0.1 and 0.3 have often been used.
In order to avoid the arbitrariness of the $\Delta R$ choice, we propose a different prescription: a jet
reconstructed in the full event is considered matched to a hard jet if the constituents common to
both the hard and the full jet make up at least 50% of the transverse momentum of the constituents
of the hard jet.³ This definition has the advantage that for a given hard jet, at most one full jet can
satisfy this criterion, therefore avoiding having to deal with multiple positive matchings. Another 
advantage is that it automatically rejects fake matchings given by a soft jet that happens to be 
close to a hard one. 
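The constituent-based matching can be sketched in a few lines. This is a minimal illustration, not the authors' code: jets are represented as hypothetical dictionaries mapping a particle identifier to its $p_t$, the shared momentum is taken as a scalar sum of constituent $p_t$, and a strict "more than 50%" test is used so that at most one full jet can match.

```python
def matched_full_jet(hard_jet, full_jets):
    """Return the index of the full jet matched to hard_jet, or None.

    hard_jet  : dict {particle_id: pt} of the hard jet's constituents
    full_jets : list of dicts of the same form, one per full-event jet
    A full jet matches if the constituents it shares with the hard jet
    carry more than half of the hard jet's scalar-summed pt.
    """
    hard_pt = sum(hard_jet.values())
    for index, full_jet in enumerate(full_jets):
        shared_pt = sum(pt for pid, pt in hard_jet.items() if pid in full_jet)
        if shared_pt > 0.5 * hard_pt:
            return index  # unique: two disjoint jets cannot both exceed 50%
    return None

# Toy example: particles 1-3 come from the hard event, 10-12 from the background.
hard = {1: 30.0, 2: 15.0, 3: 5.0}
full = [{3: 5.0, 10: 2.0}, {1: 30.0, 2: 15.0, 11: 1.0, 12: 2.0}]
print(matched_full_jet(hard, full))
```

Because each particle belongs to only one full jet, the strict threshold guarantees uniqueness, and a nearby soft jet that shares few hard constituents is rejected automatically.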

For a matched pair of jets, we shall use the notation $p_t^{pp}$ and $p_t^{AA}$ for the transverse momentum
of the jet in the hard and full event respectively, and $p_t^{pp,\mathrm{sub}}$ and $p_t^{AA,\mathrm{sub}}$ for their subtracted
equivalents. The quantity that we shall mostly concentrate on is the $p_t$ offset,

$$\Delta p_t = p_t^{AA,\mathrm{sub}} - p_t^{pp,\mathrm{sub}} , \qquad (1)$$

between the momentum of the background-subtracted jet in the full event and its equivalent in
the hard event. Large deviations from zero will indicate a poor subtraction. Events without any
matched jets are simply discarded and instead contribute to the evaluation of matching inefficiencies
(see section 4.1).

3 Actually, in our implementation, the condition was that the common part should be greater than 50% of the 
$p_t$ of the hard jet after UE subtraction as in section 3.5. In practice, this detail has negligible impact on matching
efficiencies and other results. 






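The average of the $p_t$ shift over many events, $\langle \Delta p_t \rangle$, is only one measure that one may examine
to establish the quality of jet reconstruction; its dispersion,

$$\sigma_{\Delta p_t} \equiv \sqrt{\langle \Delta p_t^2 \rangle - \langle \Delta p_t \rangle^2} , \qquad (2)$$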



can be another important one. This is especially true in the case of steeply-falling jet spectra, 
where this dispersion can have a large impact on the measured cross section, necessitating delicate 
deconvolutions, also known as "unfolding". 

In this paper we shall concentrate on these two quality measures, $\langle \Delta p_t \rangle$ and $\sigma_{\Delta p_t}$, keeping only
the jets that have been matched to one of the two hardest (subtracted) jets in the hard
event. Small (absolute) values of both $\langle \Delta p_t \rangle$ and $\sigma_{\Delta p_t}$ will be the sign of a good subtraction. In
practice a trade-off may exist between offset and dispersion: which one to optimise may depend on the
specific observable one wants to measure.⁴
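As an illustration of the two quality measures, the sketch below computes the mean offset and its dispersion from lists of matched, subtracted jet momenta. It assumes that the dispersion is the square root of the variance of $\Delta p_t$ over the sample; the function name and the toy numbers are hypothetical.

```python
import math

def offset_and_dispersion(pt_full_sub, pt_hard_sub):
    """Return (<Delta pt>, sigma_{Delta pt}) for matched pairs of jets.

    pt_full_sub : subtracted pt of each matched jet in the full event
    pt_hard_sub : subtracted pt of the corresponding hard-event jet
    """
    deltas = [full - hard for full, hard in zip(pt_full_sub, pt_hard_sub)]
    n = len(deltas)
    mean = sum(deltas) / n
    mean_sq = sum(d * d for d in deltas) / n
    # Guard against tiny negative values caused by floating-point rounding.
    sigma = math.sqrt(max(mean_sq - mean**2, 0.0))
    return mean, sigma

# Toy sample of four matched jets (GeV).
mean, sigma = offset_and_dispersion([52.0, 47.0, 61.0, 40.0],
                                    [50.0, 50.0, 60.0, 40.0])
print(mean, sigma)
```

A sample with zero mean offset can still have a sizeable dispersion, which is why both numbers are tracked.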

3.3 Monte Carlo simulations 

Quantifying the quality of background subtraction using Monte Carlo simulations has several ad- 
vantages. Besides providing a practical way of generating the hard "signal" separately from the 
soft background, one can easily check the robustness of one's conclusions by changing the hard jet 
or the background sample. 

One difficulty that arises in gauging the quality of jet reconstruction in heavy-ion collisions 
comes from the expectation that parton fragmentation in a hot medium will differ from that in 
a vacuum. This difference is often referred to as jet quenching [21] (for reviews, see [22]). The
details of jet quenching are far less well established than those of vacuum fragmentation and can 
have an effect on the quality of jet reconstruction. Here we shall examine the reconstruction of 
both unquenched and quenched jets. For the latter it will be particularly important to be able to 
test more than one quenching model, in order to help build confidence in our conclusions about any 
reconstruction bias that may additionally exist in the presence of quenching. 

In practice, for this paper we have used both the Fortran (v1.6) [19] and the C++ (v2.1) [20]
versions of HYDJET to generate the background. Hard jets have been generated with PYTHIA 6.4
[23], either running it standalone, or using the version embedded in HYDJET v1.6. The quenching
effects have been studied using both QPYTHIA [24] and PYQUEN [19].

3.4 Jet definitions 

As mentioned in the Introduction, the last few years have seen many developments in the field
of jet clustering, with the appearance of fast implementations [25] of the $k_t$ [1] and the
Cambridge/Aachen [2] sequential recombination algorithms, and the introduction of the SISCone [4]
and anti-$k_t$ [3] algorithms.

Besides these four main infrared-and-collinear-safe jet algorithms, a number of recent papers
(see e.g. [26, 27]) have introduced and studied jet-cleaning techniques in order to help the
reconstruction of jets by reducing the UE contamination. "Filtering" [26] works by reclustering each
jet with a radius $R_{\rm filt}$ smaller than the original radius $R$ and keeping only the $n_{\rm filt}$ hardest subjets
(background subtraction is applied to each of the subjets before deciding which ones to keep). Other
similar techniques, "trimming" and "pruning", that exploit the substructure of jets have also been
introduced recently [27, 28] and, in pp environments, have been found to give benefits comparable
to those of filtering. Note that the filtering that we use here is unrelated to the Gaussian filter
approach of [9] (not used in this paper as no public code is currently available).

4 Note also that $\langle \Delta p_t \rangle$ and $\sigma_{\Delta p_t}$ may not, in general, fully characterise the quality of the reconstruction, as the
distribution of $\Delta p_t$ may be non-Gaussian.

Figure 1: Graphical representation of the 4 different background-estimation ranges we shall consider:
the Global range, the Strip range $S_\Delta(j)$, the Circular range $C_\Delta(j)$ and the Doughnut range $D_{\delta,\Delta}(j)$.
The last three are local ranges with a position depending on the jet being subtracted. See the text
for detailed definitions.

In analogy with the systematic analysis of the performance of various jet definitions in the
kinematic reconstruction of dijet systems at the LHC [17], here we will study how different algorithms
behave in the case of HI collisions. We will use the $k_t$, Cambridge/Aachen (C/A), and anti-$k_t$
algorithms, as well as filtering applied to Cambridge/Aachen (C/A(filt)) with $R_{\rm filt} = R/2$ and $n_{\rm filt} = 2$.
We have not used SISCone, as its relatively slower speed compared to the sequential recombination
algorithms makes it less suitable for a HI environment.

All the algorithms have been used through their fast implementation available in the FastJet 
package [25, 29], version 2.4.2. Additionally some features of a forthcoming FastJet release have
been used to help simplify our analyses. 

3.5 Background determination and subtraction 

In order to subtract the HI background from the hard jets, we will mainly follow the method 
introduced in [14]. When clustering the event we determine the 4-vector active area $A_j$ [15] of each
jet $j$, as well as an estimate of the background density $\rho$. Then for each jet $j$, we subtract from its
four-momentum the expected background contamination:

$$p_j^{\mu,\mathrm{sub}} = p_j^{\mu} - \rho A_j^{\mu} . \qquad (3)$$
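The subtraction step is a component-wise operation on four-vectors. The sketch below assumes the reading of eq. (3) in which $\rho$ times the area four-vector is subtracted from the jet four-momentum; the tuple layout `(px, py, pz, E)` and the toy numbers are illustrative and do not reproduce FastJet's interface.

```python
import math

def subtract_background(p_jet, area_jet, rho):
    """Subtract rho * A^mu from p^mu, component by component.

    p_jet, area_jet : 4-vectors as (px, py, pz, E) tuples
    rho             : background pt density per unit area (GeV)
    """
    return tuple(p - rho * a for p, a in zip(p_jet, area_jet))

def transverse_momentum(p):
    """pt of a (px, py, pz, E) 4-vector."""
    return math.hypot(p[0], p[1])

# Toy jet pointing along x with an area 4-vector of transverse size 0.5.
p_full = (150.0, 0.0, 20.0, 152.0)
area = (0.5, 0.0, 0.05, 0.51)
p_sub = subtract_background(p_full, area, rho=100.0)
print(transverse_momentum(p_sub))
```

Using the area four-vector rather than a scalar area means that the direction of the jet, and not only its magnitude, is corrected.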
The background transverse-momentum density per unit area, $\rho$, is determined, event-by-event, as
proposed in [14]. In that paper two ways of determining $\rho$ were explored. One was to take a global
background-estimation range $\mathcal{R}$, covering the full rapidity-azimuth plane up to some rapidity $y_{\max}$;
one then considered the set of ratios of transverse components of the momentum and area 4-vectors,
$p_{t,j}/A_{t,j}$, for the jets within this range. The median of this set was used as an estimate for $\rho$:

$$\rho_{\mathcal{R}} = \mathop{\mathrm{median}}_{j \in \mathcal{R}} \left\{ \frac{p_{t,j}}{A_{t,j}} \right\} , \qquad (4)$$



where the subscript $\mathcal{R}$ explicitly denotes that the background density has been calculated using only
jets in the range $\mathcal{R}$. This method assumes that the background level is sufficiently constant within
$\mathcal{R}$. This condition is known to be violated when going to sufficiently large rapidity, and the effect
is particularly marked in heavy-ion collisions. For this reason, it was alternatively proposed in [14]
to fit the mean $p_{t,j}/A_{t,j}$ as a function of rapidity $y$ with a quadratic functional form $\rho = \rho_0 + \rho_2 y^2$.
While working on this paper, and partly inspired by discussions with members of the STAR
collaboration (cf. also ref. [18]), we came to the conclusion that neither of these two methods allows
one to extract sufficiently accurate values of $\rho$ in the context of HI collisions. We therefore propose
a variant of the first method, namely making the background-estimation range local and dependent
on a given jet's position. Graphical representations of a global range, as well as of three possible
local ranges that we shall use hereafter, are given in fig. 1. More specifically, for a given jet $j$, our
three local ranges are defined as follows:⁵

• the Strip range, $S_\Delta(j)$, includes the jets $j'$ satisfying $|y_{j'} - y_j| < \Delta$,

• the Circular range, $C_\Delta(j)$, includes the jets $j'$ satisfying $\sqrt{(y_{j'} - y_j)^2 + (\phi_{j'} - \phi_j)^2} < \Delta$,

• the Doughnut range, $D_{\delta,\Delta}(j)$, includes the jets $j'$ satisfying $\delta < \sqrt{(y_{j'} - y_j)^2 + (\phi_{j'} - \phi_j)^2} < \Delta$.

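Using $\mathcal{R}(j)$ to denote any local range around jet $j$, the background density will then depend on the
jet being subtracted and will be estimated using

$$\rho_{\mathcal{R}(j)} = \mathop{\mathrm{median}}_{j' \in \mathcal{R}(j)} \left\{ \frac{p_{t,j'}}{A_{t,j'}} \right\} . \qquad (5)$$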
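A local median estimate of this kind can be sketched as follows. The code is an illustration under stated assumptions, not FastJet's implementation: jets are hypothetical dictionaries with keys `y`, `phi`, `pt` and `area` (the latter two standing for the transverse components of the momentum and area four-vectors), azimuthal differences are wrapped into $[0, \pi]$, and an empty range returns zero by convention.

```python
import math
from statistics import median

def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [0, pi]."""
    d = abs(phi1 - phi2) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def distance(jet, other):
    """Distance in the rapidity-azimuth plane."""
    return math.hypot(other["y"] - jet["y"], delta_phi(other["phi"], jet["phi"]))

def in_strip(jet, other, Delta):
    return abs(other["y"] - jet["y"]) < Delta

def in_circle(jet, other, Delta):
    return distance(jet, other) < Delta

def in_doughnut(jet, other, delta, Delta):
    return delta < distance(jet, other) < Delta

def rho_local(jet, background_jets, in_range):
    """Median of pt/area over the background jets inside the range around jet."""
    ratios = [b["pt"] / b["area"] for b in background_jets if in_range(jet, b)]
    return median(ratios) if ratios else 0.0  # convention for an empty range

# Toy event: three soft jets near the jet of interest, one far away in rapidity.
jet = {"y": 0.0, "phi": 0.0}
soft = [
    {"y": 0.1, "phi": 0.1, "pt": 50.0, "area": 0.5},
    {"y": 0.5, "phi": 6.2, "pt": 45.0, "area": 0.5},
    {"y": -0.3, "phi": 1.0, "pt": 60.0, "area": 0.5},
    {"y": 3.0, "phi": 0.0, "pt": 10.0, "area": 0.5},
]
print(rho_local(jet, soft, lambda j, b: in_circle(j, b, 1.2)))
print(rho_local(jet, soft, lambda j, b: in_doughnut(j, b, 0.4, 1.2)))
```

Passing the range as a predicate keeps the median step identical for the strip, circular and doughnut choices.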



It is important to be aware that the estimate of the background density $\rho$ is just an input to eq. (3)
and that the jet definition that is used for the computation of $\rho$ in eqs. (4) or (5) can be different
from the jet definition used to obtain the "physical" jets and subtract them in eq. (3). The only
two recommended jet algorithms for the task of determining $\rho$ are the $k_t$ and C/A algorithms (see
[14]); others are not ideal because they return many jets with very small areas, which distorts the
median procedure. The radius, $R_\rho$, used in the jet definition for determining $\rho$ can also differ from
the radius $R$ used to find the jets.

When using a local range, the underlying idea is to limit the sensitivity to the long-range 
variations of the background density, by using only the jets in the vicinity of the jet we want to 
subtract. In practice, a compromise needs to be found between choosing a range small enough to get 
a valid local estimation, but also large enough to contain a sufficiently large number of background 
(soft) jets for the estimation of the median to be reliable. Two effects need to be considered: (1) 
statistical fluctuations in the estimation of the background and (2) biases due to the presence of a 
hard jet in the region used to estimate the background. 



5 The CircularRange is distributed with the current FastJet release (v2.4.2) and the StripRange can be simulated
from the default RangeDefinition. A more systematic approach to local ranges, including the DoughnutRange, will 
be available in the forthcoming FastJet release. 
1. If we require that the dispersion in the reconstructed jet $p_t$ coming from the statistical
fluctuations in the estimation of the background does not amount to more than a fraction $\epsilon$ of
the overall $\sigma_{\Delta p_t}$, then as discussed in appendix A.1 the range needs to cover an area $A_{\mathcal{R}}$ such
that

$$A_{\mathcal{R}} \gtrsim \frac{\pi^2 R^2}{4 \epsilon} . \qquad (6)$$

Taking $\epsilon = 0.1$ and $R = 0.4$ this corresponds to an area $A_{\mathcal{R}} \gtrsim 25 R^2 \simeq 4$. For the applications
below, when clustering with radius $R$, we have used the rapidity-strip ranges $S_{2R}$ and $S_{3R}$,
the circular range $C_{3R}$ and the doughnut range $D_{R,3R}$. One can check that their areas are
compatible with the $25 R^2$ lower limit estimated above.

2. The bias in the estimate of $\rho$ due to the presence of $n_b$ hard jets in the range is given roughly
by

$$\langle \Delta\rho \rangle \simeq 1.8\, \sigma R_\rho \frac{n_b}{A_{\mathcal{R}}} , \qquad (7)$$

as discussed in [16] and appendix A.2. The ensuing bias on the $p_t$ can be estimated as $\langle \Delta\rho \rangle \pi R^2$
(for anti-$k_t$ jets). For $R = 0.4$, $R_\rho = 0.5$, $A_{\mathcal{R}} \simeq 4$ and $n_b = 1$, the bias in the reconstructed jet
$p_t$ is $\simeq 0.1\sigma$. Given $\sigma$ in the 10-20 GeV range (as we will find in section 4) this corresponds
to a 1-2 GeV bias. In order to eliminate this small bias, we will often choose to exclude the
two hardest jets in each event when determining $\rho$.⁶

A third potential bias discussed in ref. [16] is that of underestimating the background when using
too small a value for $R_\rho$. Given the high density of particles in HI collisions, this will generally not
be an issue as long as $R_\rho \sim 0.5$.
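Both numerical statements above can be checked with a few lines of arithmetic. The sketch assumes that the ranges cover the full azimuth and are not truncated by the rapidity acceptance, uses the $25 R^2$ lower limit quoted in the text, and takes the bias on $\rho$ to be $1.8\,\sigma R_\rho n_b / A_{\mathcal{R}}$, the form that reproduces the quoted $0.1\sigma$; the function names are illustrative.

```python
import math

def strip_area(Delta):
    """Area of |y_j' - y_j| < Delta over the full azimuth."""
    return 2 * Delta * 2 * math.pi

def circle_area(Delta):
    return math.pi * Delta**2

def doughnut_area(delta, Delta):
    return math.pi * (Delta**2 - delta**2)

def pt_bias_in_units_of_sigma(R, R_rho, area, n_hard):
    """Bias on the jet pt, divided by sigma, from n_hard hard jets in the range."""
    delta_rho_over_sigma = 1.8 * R_rho * n_hard / area
    return delta_rho_over_sigma * math.pi * R**2

R = 0.4
areas = {
    "S_2R": strip_area(2 * R),
    "S_3R": strip_area(3 * R),
    "C_3R": circle_area(3 * R),
    "D_R,3R": doughnut_area(R, 3 * R),
}
minimum = 25 * R**2
for name, area in areas.items():
    print(f"{name}: area = {area:.2f}, above minimum {minimum:.2f}: {area >= minimum}")

print(f"pt bias / sigma = {pt_bias_in_units_of_sigma(R, 0.5, 4.0, 1):.3f}")
```

The circular and doughnut ranges sit only just above the minimum area, whereas the strip ranges exceed it comfortably.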



4 Results 

As mentioned already in section 3.3, for the simulation of the events we have used PYTHIA 6.4 [23]
to generate the unquenched hard jets; the background events are generated using HYDJET v1.6 [19]
with 0-10% centrality.⁷ Our setup for the background leads, for RHIC (AuAu, $\sqrt{s_{NN}} = 200$ GeV),
to an average background density per unit area at central rapidity of $\langle \rho \rangle \simeq 99$ GeV with average
fluctuations in a single event of $\langle \sigma \rangle \simeq 8$ GeV and event-to-event fluctuations
$\sigma_\rho = \sqrt{\langle \rho^2 \rangle - \langle \rho \rangle^2} \simeq 14$ GeV. For the LHC (PbPb, $\sqrt{s_{NN}} = 5.5$ TeV) the corresponding values are $\langle \rho \rangle \simeq 310$ GeV,
$\langle \sigma \rangle \simeq 20$ GeV and event-to-event fluctuations $\sigma_\rho \simeq 45$ GeV. Fig. 2 shows the distributions obtained
from the simulations for $\rho$ and $\sigma$.

In the case of the RHIC simulation, the result for $\langle \rho \rangle$ is somewhat higher than the experimental
value of 75 GeV quoted by the STAR collaboration [18]; however, this is probably in part due to
limited tracking efficiencies at low $p_t$ at STAR,⁸ and explicit STAR [30] and PHENIX [31] results
for $dE_t/d\eta$ correspond to somewhat higher $\rho$ values, about 90 GeV. The multiplicity of charged
particles ($dN_{ch}/d\eta \simeq 660$ for $\eta = 0$ and 0-6% centrality) and the pion $p_t$ spectrum in our simulation
are sensible compared to experimental measurements at RHIC [32, 33, 34]. For the LHC our charged
particle multiplicity is $dN_{ch}/d\eta \simeq 1600$ for $\eta = 0$ and 0-10% centrality, which is comparable to
many of the predictions reviewed in fig. 7 of [35].

6 We deliberately choose to exclude the two hardest jets in the event, not simply the two hardest in the range. Note,
however, that for realistic situations with limited acceptance, only one jet may be within the acceptance, in which case
the exclusion of a single jet might be more appropriate. Excluding a second one does not affect the result significantly,
so it is perhaps a good idea to use the same procedure regardless of any acceptance-related considerations.

7 The events have been generated using the following HYDJET v1.6 program parameters: for RHIC, nh = 9000,
ylfl = 3.5, ytfl = 1.3 and ptmin = 2.6 GeV; for LHC, nh = 30000, ylfl = 4, ytfl = 1.5 and ptmin = 10 GeV. In both
cases quenching effects are turned on in HYDJET, nhsel = 2, even when they are not included for the embedded
pp event. The corresponding PYQUEN parameters we have used are ienglu = 0, ianglu = 0, T0 = 1.0 GeV,
tau0 = 0.1 fm and nf = 0.

8 We thank Helen Caines for discussions on this point.

Figure 2: Distribution of the background density $\rho$ per unit area (left) and its intra-event
fluctuations $\sigma$ (right). It has been obtained from 5000 HYDJET events with RHIC (AuAu,
$\sqrt{s_{NN}} = 200$ GeV) and LHC (PbPb, $\sqrt{s_{NN}} = 5.5$ TeV) kinematics. The background properties
have been estimated using the techniques presented in section 3.5, using the $k_t$ algorithm with
$R_\rho = 0.5$, and keeping only the jets with $|y| < 1$ (excluding the two hardest).

An independent control analysis has also been performed with HYDJET++ 2.1 [20] (with default
parameters) for the background. The results at RHIC are similar, while for LHC the comparison is
difficult because the default tune of HYDJET++ 2.1 predicts a much higher multiplicity,
$dN_{ch}/d\eta \simeq 2800$ for $\eta = 0$ and 0-10% centrality.

Most of the results of this section will be obtained without quenching, though in section 4.5
we will also consider the impact on our conclusions of the PYQUEN [19] and QPYTHIA [24]
simulations of quenching effects.

For the results presented below, we have employed a selection cut of $|y| < y_{\max}$ on the jets with
$y_{\max} = 1$ for RHIC and $y_{\max} = 2.4$ for the LHC.⁹ We only consider full jets that are matched to
one of the two hardest jets in the hard event. The computation of the jet areas in FastJet, needed
both for the subtraction and the background estimation, has been performed using active areas,
with ghosts up to $y_{\max} + 1.8$, a single repetition and a ghost area of 0.01.¹⁰ The determination of
the background density $\rho$ has been performed using the $k_t$ algorithm with $R_\rho = 0.5$. Though the
estimate of the background depends on $R_\rho$ [16], we have observed that choices between 0.3 and 0.5
lead to very similar results (e.g. differing by at most a few hundred MeV at RHIC).

9 This corresponds roughly to the central region for the ATLAS and CMS detectors. For ALICE, the acceptance
is more limited [6, 12]. Some adaptation of our method will be needed for estimating $\rho$ in that case, in order for
information to be derived from jets near the edge of the acceptance and thus bring the available area close to the
ideal requirements set out in appendix A. Note that we also use particles beyond $y_{\max}$ in the jet clustering and apply
the acceptance cut only to the resulting jets.

10 The active area [15] is the natural choice for subtraction as it mimics the uniform soft background. We also use
the "explicit ghosts" option of FastJet, which gives a better computation of the empty area in sparse events. For
the C/A algorithm with filtering, explicit ghosts also allow for subtraction of each individual subjet before selecting
the two hardest subjets. Finally, note that since we are mostly dealing with high-multiplicity events, the difference
between active and passive areas is negligible, and we could in some cases also have used the latter (e.g. to limit
certain speed and memory issues if we had used SISCone).

Figure 3: Matching efficiency for reconstructed jets as a function of the jet $p_t$. Left: RHIC, right:
LHC. These results are independent of the choice of background-subtraction range in the heavy-ion
events, since background subtraction does not enter into the matching criterion. Here and in
later figures, the label "unquenched" refers to the embedded pp event; the background is always
simulated including quenching.

With this setup, we have studied the various ranges presented in section 3.5 with the jet
algorithms from section 3.4. In all cases, we have taken the radius parameter $R = 0.4$. We have
adopted this value as it is the largest currently used at RHIC. Note that the effect of the background
fluctuations on the jet energy resolution increases linearly with $R$, disfavouring significantly larger
choices. On the other hand, too small a choice of R may lead to excessive sensitivity to the details 
of parton fragmentation, hadronisation and detector granularity. 

4.1 Matching efficiency 

Let us start the presentation of our results with a brief discussion of the efficiency of reconstructing 
jets in the medium. As explained in section 3.2, the jets in the medium are matched to a "bare"
hard jet when their common particle content accounts for at least 50% of the latter's transverse 
momentum. 

The matching efficiencies we observe depend to some extent on the details of the Monte Carlo
used for the background, so our intention is just to illustrate the typical behaviour we observe and
highlight that these efficiencies tend to be large. We observe from fig. 3 that we successfully match
at least 95% of the jets above $p_t \simeq 15$ GeV at RHIC, and at least 99% of the jets above $p_t \simeq 60$
GeV at the LHC. It is also interesting to notice that the anti-$k_t$ algorithm performs best, likely as
a consequence of its 'rigidity', namely the fact that anti-$k_t$ jets tend to have the same (circular)
shape, independently of the soft particles that are present.

4.2 Choice of background-estimation range 

We now turn to the results concerning the measurement of the background density and the
reconstruction of the jet transverse momentum. We first concentrate on the impact of the choice of a
local range and/or of the exclusion of the two hardest jets when determining $\rho$.¹¹

Figure 4: Effect of the choice of range on the average $p_t$ shift, $\langle \Delta p_t \rangle$, with $\Delta p_t$ as defined in eq. (1). Left:
RHIC, right: LHC. In this figure and those that follow, the yellow band corresponds to 1% of the
$p_t$ of the hard jet.

In fig. 4 we show the average shift $\langle \Delta p_t \rangle$ for the list of ranges mentioned in section 3.5. The results
presented here have been obtained with the anti-$k_t$ algorithm with $R = 0.4$, but the differences
among the various range choices have been seen to be similar with other jet definitions. The label
"2 excl" means that the two hardest jets in the event have been excluded from the estimation
of the background. We have found that this improves the precision of the subtractions whenever
expected, i.e. for all choices of range except the doughnut range, where its central hole already acts
similarly to the exclusion of the hardest jets. To keep the figure reasonably readable, we have only
explicitly shown the effect of removing the two hardest jets for the global range. The change of
0.4-0.6 GeV (both for RHIC and the LHC) is in reasonable agreement with the analytic estimate
of about 0.6 GeV for RHIC and the LHC obtained from $\langle \Delta p_t \rangle = \pi R^2 \langle \Delta\rho \rangle$ with $\langle \Delta\rho \rangle$ calculated
using eq. (7). Note that at LHC the exclusion of the two hardest jets for the global range appears
to worsen the subtraction; however, what is really happening is that the removal of the two hardest
jets exacerbates a deficiency of the global range, namely the fact that its broad rapidity coverage
causes it to underestimate $\rho$, leading to a positive net $\langle \Delta p_t \rangle$.
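The analytic estimate of about 0.6 GeV can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions: the bias on $\rho$ is taken as $1.8\,\sigma R_\rho n_b / A_{\mathcal{R}}$, the global range is assumed to cover $|y| < y_{\max}$ over the full azimuth (so its area is $2 y_{\max} \cdot 2\pi$), and the inputs are the $\langle \sigma \rangle$, $y_{\max}$, $R$ and $R_\rho$ values quoted in this section with $n_b = 2$ hard jets.

```python
import math

def global_range_pt_shift(sigma, y_max, R=0.4, R_rho=0.5, n_hard=2):
    """Estimated pt shift (GeV) caused by n_hard hard jets in a global range.

    sigma  : intra-event background fluctuation (GeV)
    y_max  : rapidity acceptance, giving a range area of 2 * y_max * 2 * pi
    """
    area = 2 * y_max * 2 * math.pi
    delta_rho = 1.8 * sigma * R_rho * n_hard / area
    return math.pi * R**2 * delta_rho

print(f"RHIC: {global_range_pt_shift(sigma=8.0, y_max=1.0):.2f} GeV")
print(f"LHC:  {global_range_pt_shift(sigma=20.0, y_max=2.4):.2f} GeV")
```

The larger fluctuations at the LHC are compensated by the larger area of the global range, which is why the two estimates come out nearly equal.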

Other features that can be understood qualitatively include, for example, the differences between the two strip ranges and the global (2 excl) range for RHIC: while the rapidity width of the global range lies in between that of the two strip ranges, the global range gives a lower $\langle\Delta p_t\rangle$ than both, corresponding to a larger $\rho$ estimate. This is reasonable because the global range is centred on $y = 0$, whereas the strip ranges are mostly centred at larger rapidities, where the background is lower.

The main result of the analysis of fig. 4 is the observation that all choices of a local range lead to a small residual $\Delta p_t$ offset: the background subtraction typically leaves $|\langle\Delta p_t\rangle| < 1$ GeV at both RHIC and the LHC, i.e. better than 1-2% accuracy over much of the $p_t$ range of interest. It is not clear, within this level of accuracy, whether one range is to be preferred to another, nor is it always easy to identify the precise origins of the observed differences between various ranges.$^{12}$ Another way of viewing this is that the observed differences between the various choices give an estimate of the residual subtraction error due to possible misestimation of $\rho$. For our particular analysis, at RHIC this comment also applies to the choice of the global range (with the exclusion of the two hardest jets in the event). This is a consequence of the limited rapidity acceptance, which effectively turns the global range into a local one, a situation that does not hold for larger rapidity acceptances, as we have seen for the LHC results. In what follows we will use the Doughnut($R$, $3R$) choice, since it provides a good compromise between simplicity and effectiveness.

11 Independently of the choice made for the full event, we always use a global range up to $|y| = y_{\max}$ for the determination of $\rho$ in the hard event, without exclusion of any jets. This ensures that the reference jet $p_t$ is always kept the same. The impact of subtraction in the hard event is in any case small, so the particular choice of range is not critical.

12 Furthermore, the differences may also be modified by jet-medium interactions.

Figure 5: Distribution of $\Delta p_t$ (red histograms) for each of our 4 jet algorithms, together with a Gaussian (black curve) whose mean (solid vertical line) and dispersion are equal to $\langle\Delta p_t\rangle$ and $\sigma_{\Delta p_t}$ respectively.
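As an illustration of how a local range enters the background estimate, the sketch below computes a median-based $\rho$ from background jets lying in an annulus around the jet of interest and then subtracts $\rho$ times the jet area. It is a minimal sketch and not the FastJet implementation: it assumes that Doughnut($R$, $3R$) selects background jets whose axis is at a distance between $R$ and $3R$ from the jet in the rapidity-azimuth plane, and that $\rho$ is taken as the median of $p_t$/area over those jets. The dictionary layout and the function names `delta_r`, `rho_in_doughnut` and `subtracted_pt` are hypothetical.

```python
import math
from statistics import median


def delta_r(y1, phi1, y2, phi2):
    """Distance in the (rapidity, azimuth) plane, with the azimuthal
    difference wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(y1 - y2, dphi)


def rho_in_doughnut(target, bkg_jets, r_in, r_out):
    """Median of pt/area over background jets with r_in < dR < r_out
    around the target jet (assumed definition of the doughnut range)."""
    densities = [
        jet["pt"] / jet["area"]
        for jet in bkg_jets
        if jet["area"] > 0.0
        and r_in < delta_r(target["y"], target["phi"], jet["y"], jet["phi"]) < r_out
    ]
    if not densities:
        raise ValueError("no background jets in the requested range")
    return median(densities)


def subtracted_pt(target, rho):
    """Area-based subtraction: pt_sub = pt - rho * area."""
    return target["pt"] - rho * target["area"]
```

Using the median rather than the mean limits the influence of the few hard jets that fall inside the range, which is also why explicitly excluding the two hardest jets matters less for ranges that already leave out the jet of interest.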



4.3 Choice of algorithm 

The next potential systematic effect that we consider is the choice of the jet algorithm used for the clustering.$^{13}$ Fig. 5 shows the distribution of $\Delta p_t$ for each of our four choices of jet algorithm, $k_t$, C/A, anti-$k_t$ and C/A(filt), given for RHIC collisions and a specific bin of the hard jets' transverse momenta, $30 < p_{t,\rm hard} < 35$ GeV. One sees significant differences between the different algorithms. One also observes that Gaussians with mean and dispersion set equal to $\langle\Delta p_t\rangle$ and $\sigma_{\Delta p_t}$ provide a fair description of the full histograms. This validates our decision to concentrate on $\langle\Delta p_t\rangle$ and $\sigma_{\Delta p_t}$ as quality measures. One should nevertheless be aware that in the region of high $|\Delta p_t|$ there are deviations from perfect Gaussianity, which are more visible if one replicates fig. 5 with a logarithmic vertical scale (not shown, for brevity).
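The two quality measures can be computed directly from matched jet pairs. The following is a minimal sketch, assuming that $\Delta p_t$ is the subtracted heavy-ion jet $p_t$ minus the $p_t$ of the matched hard jet (the sign convention implied by the later discussion of the reconstructed spectrum); the function name `shift_and_dispersion` and the input lists are hypothetical.

```python
from statistics import fmean, pstdev


def shift_and_dispersion(pt_sub, pt_hard):
    """Return (<Delta pt>, sigma_{Delta pt}) for matched jet pairs.

    pt_sub  : subtracted pt of each jet in the full (heavy-ion) event
    pt_hard : pt of the matched jet in the hard event, in the same order
    """
    if len(pt_sub) != len(pt_hard) or not pt_sub:
        raise ValueError("need two non-empty lists of equal length")
    deltas = [sub - hard for sub, hard in zip(pt_sub, pt_hard)]
    # Population standard deviation, i.e. the dispersion of the sample itself.
    return fmean(deltas), pstdev(deltas)
```

These two numbers fix the Gaussian that is overlaid on each histogram; any non-Gaussian tails have to be inspected separately, for example on a logarithmic scale.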



13 Recall that in all cases, the $k_t$ algorithm with $R_\rho = 0.5$ is used for the estimation of the background.

Figure 6: Average shift $\langle\Delta p_t\rangle$, as a function of $p_{t,\rm hard}$, shown for RHIC (left) and the LHC (right).

Figure 7: Contribution to $\langle\Delta p_t\rangle$ due to back-reaction. Note that these results are independent of the range used for estimating $\rho$ in the heavy-ion event. Left: RHIC, right: LHC.

4.3.1 Average $\Delta p_t$

The first observable we analyse is the average $p_t$ shift. We show in fig. 6 the $\langle\Delta p_t\rangle$ results for the four algorithms listed in section 3.4, as a function of $p_{t,\rm hard}$. We use the doughnut range to estimate the background. The first observation is that, while the anti-$k_t$ and C/A(filt) algorithms have a small residual $\langle\Delta p_t\rangle$, the C/A and $k_t$ algorithms display significant offsets. The reason for the large offsets of $k_t$ and C/A is well understood and is related to an effect known as back-reaction [15]: the addition of a soft background can alter the clustering of the particles of the hard event, so that some of the constituents of a jet in the hard event can be gained by or lost from the jet when clustering the event with the additional background of the full event. This happens, of course, on top of the simple background contamination that adds background particles to the hard jet. Even if this latter contamination is subtracted exactly, the reconstructed $p_t$ will still differ from that of the original hard jet as a consequence of the back-reaction.
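In a simulation the back-reaction can be isolated by comparing which hard-event particles belong to the jet before and after the background is added. The sketch below is illustrative only: it assumes that each hard-event particle carries an integer identifier, and it uses a scalar sum of constituent $p_t$ as a stand-in for the jet $p_t$ (a real analysis would sum four-momenta). The function name `back_reaction` and its arguments are hypothetical.

```python
def back_reaction(hard_pt, in_hard_jet, in_full_jet):
    """Return (gain, loss, net) in pt from back-reaction.

    hard_pt     : dict mapping hard-event particle id -> pt
    in_hard_jet : set of ids in the jet when clustering the hard event alone
    in_full_jet : set of hard-event ids in the matched jet when clustering
                  the full event (hard event plus background)
    """
    gained_ids = in_full_jet - in_hard_jet  # hard particles pulled into the jet
    lost_ids = in_hard_jet - in_full_jet    # hard particles pushed out of the jet
    gain = sum(hard_pt[i] for i in gained_ids)
    loss = sum(hard_pt[i] for i in lost_ids)
    return gain, loss, gain - loss
```

Background particles are deliberately absent from this bookkeeping: their contribution is what the area-based subtraction removes, whereas the net term above survives even an exact subtraction.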

The effect of the back-reaction can be studied in detail, since in Monte Carlo simulations it is possible to identify which hard-event constituents are present in a given jet before and after inclusion of the background particles in the clustering. The average $p_t$ shift due to back-reaction can be seen in fig. 7 for the different jet algorithms. As expected [15, 3], it is largest for $k_t$, and smallest (almost zero, in fact) for anti-$k_t$. By comparing fig. 7 and fig. 6 one can readily explain the difference between the $\langle\Delta p_t\rangle$ offsets of the various algorithms in terms of their back-reaction. The rigidity (and hence small back-reaction) of the anti-$k_t$ jets manifestly gives almost bias-free reconstructed jets, while the large back-reaction effects of the $k_t$ algorithm and, to a smaller extent, of the C/A algorithm translate into a worse performance in terms of average shift. The $p_t$ dependence of the back-reaction is weak. This is expected based on the interplay between the $\ln\ln p_t$ dependence found in [15] and the evolution with $p_t$ of the relative fractions of quark and gluon jets.

Figure 8: Dispersion $\sigma_{\Delta p_t}$. Left: RHIC, right: LHC.

The case of the C/A(filt) algorithm is more complex: its small net offset, comparable to that of the anti-$k_t$ algorithm, appears to be due to a fortuitous compensation between an under-subtraction of the background and a negative back-reaction. The negative back-reaction is very similar to that of C/A without filtering, while the under-subtraction is related to the fact that the selection of the hardest subjets introduces a bias towards positive fluctuations of the background. This effect is discussed in Appendix B, where we obtain the following estimate for the average $p_t$ shift (specifically for $R_{\rm filt} = R/2$):

$$\langle (\Delta p_t)_{\rm filt} \rangle \simeq 0.56\, R\, \sigma \,, \tag{8}$$

yielding an average bias of 2 GeV for RHIC and 4.5 GeV at the LHC, which are both in good agreement with the differences observed between C/A with and without filtering in fig. 6. Note that while the bias in eq. (8) is proportional to $\sigma$, the back-reaction bias is instead mainly proportional to $\rho$ [15]. It is because of these different proportionalities that the cancellation between the two effects should be considered as fortuitous. Since it also depends on the substructure of the jet, one may also expect that the cancellation that we see here could break down in the presence of quenching.
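The filtering-bias estimate of eq. (8) is simple enough to evaluate by hand, and the short sketch below does so in both directions. It assumes $R = 0.4$ for the C/A(filt) jets (the value quoted for the anti-$k_t$ study of ranges); the values of $\sigma$ returned by `implied_sigma` are merely what the quoted biases of 2 GeV and 4.5 GeV would imply under that assumption, not numbers taken from the text. Both function names are hypothetical.

```python
def filtering_bias(radius, sigma):
    """Average pt shift from filtering, eq. (8): 0.56 * R * sigma."""
    return 0.56 * radius * sigma


def implied_sigma(bias, radius):
    """Invert eq. (8): background fluctuation sigma implied by a given bias."""
    return bias / (0.56 * radius)


if __name__ == "__main__":
    R = 0.4  # assumed jet radius
    for label, bias in (("RHIC", 2.0), ("LHC", 4.5)):
        print(label, "implied sigma:", round(implied_sigma(bias, R), 2))
```

Because the bias scales with $\sigma$ while the back-reaction scales mainly with $\rho$, changing the collision conditions moves the two terms differently, which is the sense in which their cancellation is fortuitous.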

4.3.2 Dispersion of $\Delta p_t$

Our results for the $\Delta p_t$ dispersion, $\sigma_{\Delta p_t}$, are shown in fig. 8, again using the doughnut range. (Our conclusions are essentially independent of the particular choice of range.)

We first discuss the case of RHIC kinematics. For $k_t$ and anti-$k_t$, the observed dispersions are similar to the result of 6.8 GeV quoted by STAR [18] (though the number from STAR includes detector resolution effects, so that the true physical $\sigma_{\Delta p_t}$ may actually be somewhat lower). Of note, the advantage enjoyed by anti-$k_t$ in terms of smallest $\langle\Delta p_t\rangle$ does not hold at the level of the dispersion: C/A and $k_t$ tend to behave slightly better at small transverse momentum. The algorithm which performs best in terms of dispersion over the whole $p_t$ range is now C/A with filtering, for which the result is smaller than that of the other algorithms by a factor of about $1/\sqrt{2}$. This reduction factor can be explained because the dispersion $\sigma_{\Delta p_t}$ is expected to be proportional to the square root of the jet area: the C/A(filt) algorithm with $R_{\rm filt} = R/2$ and $n_{\rm filt} = 2$ produces jets with an area of, roughly, half that obtained with C/A; hence the observed reduction of $\sigma_{\Delta p_t}$.

[Figure 9 plot. Vertical axis: $\sigma_{\Delta p_t}$ [GeV]; horizontal axis: $p_{t,\rm hard}$ [GeV], from 10 to 50. Solid lines: anti-$k_t$; dashed lines: C/A(filt). Curves for the centrality classes 0-10%, 10-20%, 20-40% and 40-75%. Panel label: RHIC, unquenched, $|y| < 1$, $R = 0.4$, Doughnut($R$, $3R$).]

Figure 9: $p_t$ dependence of the $\Delta p_t$ dispersion at RHIC, for different centrality classes.
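The factor of about $1/\sqrt{2}$ can be made explicit with a rough area count. This is only a sketch: it assumes that each jet or subjet can be treated as a disc whose radius equals its radius parameter and that the two filtered subjets do not overlap. The symbol $A$ for the jet area is introduced here for illustration.

```latex
A_{\rm C/A} \simeq \pi R^2 ,
\qquad
A_{\rm C/A(filt)} \simeq n_{\rm filt}\, \pi R_{\rm filt}^2
  = 2\pi \left(\frac{R}{2}\right)^2 = \frac{\pi R^2}{2} ,
\qquad
\frac{\sigma_{\Delta p_t}^{\rm C/A(filt)}}{\sigma_{\Delta p_t}^{\rm C/A}}
  \simeq \sqrt{\frac{A_{\rm C/A(filt)}}{A_{\rm C/A}}}
  = \frac{1}{\sqrt{2}} \approx 0.71 .
```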

In the LHC setup, the conclusions are quite similar at the lowest transverse momenta shown. As $p_t$ increases, the dispersion of the anti-$k_t$ algorithm grows slowly, while that of the others grows more rapidly, so that at the highest $p_t$'s shown, the $k_t$ and C/A algorithms have noticeably larger dispersions than anti-$k_t$, and C/A(filt) becomes similar to anti-$k_t$. The growth of the dispersions can be attributed to an increase of the back-reaction dispersion. The latter is dominated by rare occurrences, where a large fraction of the jet's $p_t$ is gained or lost to back-reaction, hence the noticeable $p_t$ dependence (cf. appendix C.1). An additional effect, especially for the $k_t$ algorithm, might come from the anomalous dimension of the jet area, i.e. the growth with $p_t$ of the average jet area.

Note that the dispersion has some limited dependence on the choice of ghost area: for example, reducing it from 0.01 to 0.0025 lowers the dispersions by about 0.2-0.4 GeV at RHIC. This is discussed further in appendix C.2.

4.4 Centrality dependence 

So far, we have only considered central collisions. Since it is known that non-central collisions give rise to elliptic flow [36, 37], one might worry that this leads to an extra source of background fluctuations and/or non-uniformities, potentially spoiling the subtraction picture discussed so far. One can study this on azimuthally averaged jet samples (as we have been doing so far) or as a function of the azimuthal angle, $\Delta\phi$, between the jet and the reaction plane. As above, we use HYDJET v1.6, whose underlying HYDRO component includes a simulation of elliptic flow [38]. We have generated heavy-ion background events for RHIC in four different centrality bins: 0-10% (as above), 10-20%, 20-40% and 40-75%, with $v_2$ values respectively of 1.7%, 3.3%, 5.0% and 5.3%.$^{14}$

Figure 10: $\phi$ dependence of the $\Delta p_t$ shift at RHIC, for the 0-10% centrality bin. Left: for three different ranges for the anti-$k_t$ algorithm; right: for three different jet algorithms for the doughnut ($R$, $3R$) range. The absolute size of the $\phi$ dependence is similar for centralities up to 40% and then decreases beyond.
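The $v_2$ values quoted for each centrality bin are described in footnote 14 as the average of $\cos 2\phi$ over particles with $|\eta| < 1$. A minimal sketch of that average is given below; it assumes that particles are supplied as (pseudorapidity, azimuth) pairs and that the reaction plane sits at $\phi = 0$, as stated for the HYDJET simulations. The function name `elliptic_flow_v2` and its default arguments are hypothetical.

```python
import math


def elliptic_flow_v2(particles, psi_rp=0.0, eta_max=1.0):
    """Average of cos(2 * (phi - psi_rp)) over particles with |eta| < eta_max.

    particles : iterable of (eta, phi) pairs for the background event
    psi_rp    : azimuth of the reaction plane (0 in the simulations discussed)
    """
    selected = [phi for eta, phi in particles if abs(eta) < eta_max]
    if not selected:
        raise ValueError("no particles inside the pseudorapidity acceptance")
    return sum(math.cos(2.0 * (phi - psi_rp)) for phi in selected) / len(selected)
```

An isotropic event gives a value close to zero, while an excess of particles along the reaction plane gives a positive value.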

We first examine azimuthally averaged results, repeating the studies of the previous sections for each of the centrality bins. We find that the results for the average shift, $\langle\Delta p_t\rangle$, are largely independent of centrality, as expected if the elliptic flow effects disappear when averaged over $\phi$. The results for the dispersion are shown in fig. 9. We observe that the dispersion decreases with increasing non-centrality. Even though one might expect adverse effects from elliptic flow, the heavy-ion background decreases rapidly when one moves from central to peripheral collisions, and this directly translates into a decrease of $\sigma_{\Delta p_t}$.

The first conclusion from this centrality-dependence study is therefore that the subtraction 
methods presented in this paper appear to be applicable also for azimuthally averaged observables 
in non-central collisions. 

We next consider results as a function of $\Delta\phi$, which is relevant if one wishes to examine the correlation between jet quenching and the reaction plane. An issue in real experimental studies is the determination of the reaction plane, and the extent to which it is affected by the presence of hard jets. In the HYDJET simulations, this problem does not arise because the reaction plane always corresponds to $\phi = 0$. Figure 10 (left) shows the average $\Delta p_t$ as a function of $\Delta\phi$ for the anti-$k_t$ algorithm and several different background-estimation ranges, for the 0-10% centrality bin for RHIC. The strip range shows significant $\Delta\phi$ dependence, which is because a determination of $\rho$ averaged over all $\phi$ cannot possibly account for the local $\phi$-dependence induced by the elliptic flow. Other ranges, such as the doughnut range, instead cover a more limited region in $\phi$. They should therefore be able to provide information on the $\phi$-dependence of the background.$^{15}$ However, since their extent in $\phi$ tends to be significantly larger than that of the jet, and $\rho$ varies appreciably over that extent, some residual $\phi$ dependence remains in $\langle\Delta p_t\rangle$ after subtraction. The right-hand plot of figure 10 shows that the effect is reduced with filtering, as is to be expected since its initial background contamination is smaller. The conclusion from this part of the study is that residual $\Delta\phi$-dependent offsets may need to be corrected for explicitly in any studies of jets and their correlations with the reaction plane. The investigation of extensions to our background subtraction procedure to address this issue will be the subject of future work.

14 $v_2$ was determined as the average of $\cos 2\phi$ over all particles with $|\eta| < 1$ (excluding the additional hard pp event).

15 At the expense of being more strongly affected by jet-medium interactions that could manifest themselves as a broad enhancement of the energy flow in the vicinity of the jet.

4.5 Quenching effects 

The last issue we wish to investigate is how the phenomenon of jet quenching (i.e. medium effects on parton fragmentation) may affect the picture developed so far. The precise nature of jet quenching beyond its basic analytic properties (see e.g. [21]) is certainly hard to estimate in detail, especially at the LHC, where experimental data from, say, flow or particle spectra measurements are not yet available for the tuning of the Monte Carlo simulations. Additionally, the implementation of Monte Carlo generators that incorporate the analytic features of jet quenching models is currently a very active field [24, 39, 40]. We may therefore expect a more robust and complete picture of jet quenching in the near future, together with the awaited first PbPb collisions at the LHC.

In this section, we examine the robustness of our HI background subtraction in the presence of (simulated) jet quenching.$^{16}$ For this purpose we have used two available models which allow one to simulate quenched hard jets: PYQUEN [19], which is used by HYDJET v1.6, and QPYTHIA [24]. PYQUEN has been run with the parameters listed in footnote 7 for the LHC, and with T0 = 0.5 GeV, tau0 = 0.4 fm and nf = 2 for RHIC.$^{17}$ For QPYTHIA we have tested two options for the values of the transport coefficient and the medium length ($\hat q = 3$ GeV$^2$/fm, $L = 5$ fm and $\hat q = 1$ GeV$^2$/fm, $L = 6$ fm), with similar results. No serious attempt has been made to tune the two codes with each other or with the experimental data, beyond what is already suggested by the code defaults: in the absence of strong experimental constraints on the details of the quenching effects, this allows us to verify the robustness of our results for a range of conditions.

As in sections 4.2 and 4.3, we have embedded the hard PYQUEN or QPYTHIA events in a HYDJET v1.6 background and tested the effectiveness of the background subtraction for different choices of algorithm.$^{18}$ We shall restrict our attention to the anti-$k_t$ and C/A(filt) algorithms, as they appear to be the optimal choices from our analysis so far.

We have found that the jet-matching efficiencies are still high, with essentially no changes at RHIC, and at the LHC a doubling of the (small) inefficiencies that we saw earlier in the matching-efficiency figure (right). The dispersions $\sigma_{\Delta p_t}$ are also not significantly affected within our sample of jet-quenching simulations. We therefore concentrate on the $\langle\Delta p_t\rangle$ offset, which is plotted in fig. 11 for PYQUEN. The results are shown for both RHIC and the LHC. In the case of RHIC, and for the whole $p_t$ range up to about 50 GeV, quenching can be seen not to significantly affect the subtraction offset $\langle\Delta p_t\rangle$ (within the usual uncertainty related to the choice of range, which was shown in fig. 4). In the LHC case, instead, while the shift obtained using the anti-$k_t$ algorithm is largely similar to the unquenched case, the C/A(filt) algorithm performance can be seen to deteriorate slightly when quenching is turned on, all the more so at very large transverse momentum. The $k_t$ and the C/A algorithms are not shown for clarity, but they share the behaviour of C/A(filt). This deterioration of the quality of the subtraction can be traced back to an increased back-reaction compared to the unquenched jets. Anti-$k_t$ jets do not suffer from this effect as a consequence of the usual rigidity of this algorithm. In the case of C/A(filt), one should nevertheless emphasise that an error of (at most) 10 GeV on the reconstruction of a 500 GeV jet is still only a 2% effect. This is modest, both relative to the likely experimental precision and to the expected effect of quenching on the overall jet $p_t$, predicted by PYQUEN to be at the level of 10% at this $p_t$.

16 Our focus here is therefore not the study of quenching itself, but merely how it may affect our subtraction procedure.

17 The parameters for RHIC are taken from [19]. The difference with the default parameters does not appear to be large for the purpose of our investigations and, in any case, a systematic study of quenching effects is not among the goals of this paper.

18 The effect of the choice of range remains as in section 4.2 for the unquenched case. We will therefore keep employing the doughnut range.

Figure 11: Average $p_t$ shift for background-subtracted jets with the anti-$k_t$ and C/A(filt) jet algorithms. The dashed lines correspond to unquenched hard jets (PYTHIA) and the solid ones to quenched hard jets (PYQUEN). Results are shown for RHIC kinematics on the left plot and for the LHC on the right one.

Though for brevity we have not explicitly shown them, the results with QPYTHIA are very 
similar. 

Before closing this section, we reiterate that we have only investigated simple models for quench- 
ing and that our results are meant just to give a first estimate of the effects that one might have to 
deal with in the case of quenched jets. The expected future availability of new "quenched" Monte 
Carlo programs, together with specific measurements in the early days of heavy-ion collisions at 
the LHC, will certainly allow one to address this question more extensively. 



4.6 Relative importance of average shift and dispersion 

To close this section, we examine the relative importance of the average shift and its dispersion, taking the illustrative example of their impact on the inclusive jet cross-section as a function of $p_t$. We start from a simple parametrisation of the inclusive-jet $p_t$ spectrum and see how its reconstruction is affected by the average shift and dispersion that remain for $\Delta p_t$ after subtraction. Let us assume that the true $p_t$ spectrum decays exponentially, i.e.

$$\frac{d\sigma^{pp}}{dp_t} \propto e^{-\lambda p_t} \,. \tag{9}$$

While this expression doesn't have the $1/p_t^n$ form that one expects to see, it is far easier to handle analytically, and not too poor an approximation to observed spectra over quite a broad range of $p_t$. After embedding the hard events in a heavy-ion background and applying subtraction, the resulting spectrum will be the convolution of eq. (9) with the $\Delta p_t$ distribution. Assuming that the latter is a Gaussian of average $\langle\Delta p_t\rangle$ and dispersion $\sigma_{\Delta p_t}$ (cf. fig. 5), one obtains a reconstructed spectrum

$$\frac{d\sigma^{AA,\rm sub}}{dp_t} = \frac{d\sigma^{pp}}{dp_t}\, \exp\left(\lambda \langle\Delta p_t\rangle + \frac{\lambda^2 \sigma_{\Delta p_t}^2}{2}\right) . \tag{10}$$

Thus the average shift gives a bias by a multiplicative factor $\exp(\lambda \langle\Delta p_t\rangle)$ and the dispersion by a factor $\exp(\lambda^2 \sigma_{\Delta p_t}^2/2)$. The convolution works in such a way that for a given reconstructed $p_t$, the most likely original true transverse momentum is

$$\text{most likely } p_t^{pp} \simeq p_t^{AA,\rm sub} - \langle\Delta p_t\rangle - \lambda \sigma_{\Delta p_t}^2 \,, \tag{11}$$

where we have neglected the small impact of subtraction on the pp jets.
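For readers who wish to check the multiplicative factors and the most likely origin, the Gaussian convolution can be written out explicitly. This is a sketch under the assumptions already stated in the text (an exponential spectrum with slope $\lambda$ and a Gaussian $\Delta p_t$ distribution that does not depend on $p_t$); it additionally assumes $\Delta p_t = p_t^{AA,\rm sub} - p_t^{pp}$, and $G$ is introduced here to denote the Gaussian density.

```latex
\begin{align*}
\frac{d\sigma^{AA,\rm sub}}{dp_t}
  &= \int d(\Delta p_t)\, G(\Delta p_t)\,
     \left.\frac{d\sigma^{pp}}{dp_t}\right|_{p_t - \Delta p_t}
   = \frac{d\sigma^{pp}}{dp_t} \int d(\Delta p_t)\, G(\Delta p_t)\, e^{\lambda \Delta p_t} \\
  &= \frac{d\sigma^{pp}}{dp_t}\,
     \exp\left(\lambda \langle\Delta p_t\rangle + \frac{\lambda^2 \sigma_{\Delta p_t}^2}{2}\right), \\[4pt]
G(\Delta p_t)\, e^{\lambda \Delta p_t}
  &\propto \exp\left(-\frac{\left(\Delta p_t - \langle\Delta p_t\rangle
     - \lambda \sigma_{\Delta p_t}^2\right)^2}{2 \sigma_{\Delta p_t}^2}\right).
\end{align*}
```

The last line shows that, at fixed reconstructed $p_t$, the distribution of $\Delta p_t$ is again Gaussian but centred at $\langle\Delta p_t\rangle + \lambda \sigma_{\Delta p_t}^2$, so the most likely true $p_t$ lies below the reconstructed one by that amount.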

To illustrate these effects quantitatively, let us first take the example of RHIC, where between 10 and 60 GeV, the cross-section is well approximated by eq. (9) with $\lambda = 0.3$ GeV$^{-1}$. Both the anti-$k_t$ and C/A(filt) algorithms have $\langle\Delta p_t\rangle \simeq 0$, leaving only the dispersion effect. In the case of the anti-$k_t$ (respectively C/A(filt)) algorithm, we see from fig. 8 that $\sigma_{\Delta p_t} \simeq 7.5$ GeV (4.8 GeV), which gives a multiplicative factor of about 12 (3). For a given reconstructed $p_t$, the most likely true $p_t$ is about 17 GeV (7 GeV) smaller. In comparison, for C/A ($k_t$), with $\langle\Delta p_t\rangle \simeq -1.5$ GeV ($-3.5$ GeV) (fig. 6) and $\sigma_{\Delta p_t} \simeq 6.5$ GeV (similar for $k_t$), there is a partial compensation between factors of 0.64 (0.35) and 6.7 coming respectively from the shift and dispersion, yielding an overall factor of about 4 (2.3), while the most likely true $p_t$ is about 14 GeV (12 GeV) smaller than the reconstructed $p_t$.
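The RHIC numbers for anti-$k_t$ and C/A(filt) can be reproduced with a few lines of arithmetic. The sketch below simply evaluates the exponential factors and the offset of the most likely true $p_t$ for the values of $\lambda$ and $\sigma_{\Delta p_t}$ quoted in the text; the function names `spectrum_bias` and `most_likely_offset` are hypothetical.

```python
import math


def spectrum_bias(lam, mean_shift, sigma):
    """Return (shift factor, dispersion factor, overall factor) for an
    exponential spectrum of slope lam smeared by a Gaussian Delta pt."""
    shift_factor = math.exp(lam * mean_shift)
    dispersion_factor = math.exp(0.5 * (lam * sigma) ** 2)
    return shift_factor, dispersion_factor, shift_factor * dispersion_factor


def most_likely_offset(lam, mean_shift, sigma):
    """Amount by which the most likely true pt lies below the reconstructed pt."""
    return mean_shift + lam * sigma ** 2


if __name__ == "__main__":
    lam = 0.3  # GeV^-1, RHIC value quoted in the text
    for name, sigma in (("anti-kt", 7.5), ("C/A(filt)", 4.8)):
        _, disp, _ = spectrum_bias(lam, 0.0, sigma)
        print(name, round(disp, 1), round(most_likely_offset(lam, 0.0, sigma), 1))
```

The quadratic dependence on $\lambda \sigma_{\Delta p_t}$ in the exponent is what makes the steeply falling RHIC spectrum so much more sensitive to the dispersion than the flatter LHC one.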

At the LHC ($\sqrt{s_{NN}} = 5.5$ TeV), eq. (9) is a less accurate approximation. Nevertheless, for $p_t \sim 100$-$150$ GeV, it is not too unreasonable to take $\lambda = 0.05$ GeV$^{-1}$ and examine the consequences. For anti-$k_t$ (respectively C/A(filt)), we have $\langle\Delta p_t\rangle \simeq 0$ (also for C/A(filt)) and $\sigma_{\Delta p_t} \simeq 18$ GeV (13 GeV), giving a multiplicative factor of 1.5 (1.2), i.e. far smaller corrections than at RHIC. For a given reconstructed $p_t$, the most likely true $p_t$ is about 16 GeV (8 GeV) smaller, rather similar to the values we found at RHIC (though smaller in relative terms, since the $p_t$'s are higher), with the increase in $\sigma_{\Delta p_t}$ being compensated by the decrease in $\lambda$.

In the LHC case, it is also worth commenting on the results for the $k_t$ algorithm, since this is what was used in ref. [14]: we have $\langle\Delta p_t\rangle \simeq -8$ GeV and $\sigma_{\Delta p_t} \simeq 18$ GeV, giving a multiplicative factor of 1.05, which is consistent with the near-perfect agreement that was seen there between the pp and subtracted AA spectra. That agreement does not, however, imply perfect reconstruction, since the most likely $p_t^{pp}$ is about 7 GeV lower than $p_t^{AA,\rm sub}$.

Though the above numbers give an idea of the relative difficulties of using different algorithms at RHIC and the LHC, experimentally what matters most will be the systematic errors on the correction factors (for example due to poorly understood non-Gaussian tails of the $\Delta p_t$ distribution). Note also that a compensation between shift and dispersion factors, as happens for example with the C/A algorithm, is unlikely to reduce the overall systematic errors.



5 The issue of fakes 

While the goal of this paper is not to discuss the issue of "fake jets" in detail, it is a question that has been the subject of substantial debate recently (see for example [9, 41]). Here, therefore, we wish to devote a few words to it and discuss how it relates to our background-subtraction results so far.

In a picture in which the soft background and the hard jets are independent of each other, one way of thinking about a fake jet is that it is a reconstructed jet (with significant $p_t$) that is due not to the presence of an actual hard jet, but rather to an upwards fluctuation of the soft background. The difficulty with this definition is that there is no uniquely-defined separation between "hard" jets and soft background. This can be illustrated with the example of how HYDJET simulates RHIC collisions: one event typically consists of a soft HYDRO background supplemented with $\sim 60$ pp collisions, each simulated with a minimum $p_t$ cut of 2.6 GeV on the $2 \to 2$ scattering. To some approximation, the properties of the full heavy-ion events remain relatively unchanged if one modifies the number of pp collisions and corresponding $p_t$ cut and also retunes the soft background. The fact that this changes the number of hard jets provides one illustration of the issue that the soft/hard separation is ill-defined. Additionally, while there are $\sim 60$ semi-hard pp collisions ($\sim 120$ mostly central semi-hard jets$^{19}$) in an event, there is only space within (say) the acceptance of RHIC for $\mathcal{O}(40)$ jets. Thus there is essentially no region in an event which does not have a semi-hard jet. From this point of view, every reconstructed jet corresponds to a (semi-)hard pp jet and there are no fake jets at all.



5.1 Inclusive analyses 

For inclusive analyses, such as a measurement of the inclusive jet spectrum, this last point is particularly relevant, because every jet in the event contributes to the measurement. Then, the issue of fakes can be viewed as one of unfolding. In that respect it becomes instructive, for a given bin of the reconstructed heavy-ion $p_t$, to ask what the corresponding matched pp jet transverse momenta were. Specifically, we define a quantity $O(p_t^{AA,\rm sub}, p_t^{pp})$, the distribution of the pp "origin", $p_t^{pp}$, of a heavy-ion jet with subtracted transverse momentum $p_t^{AA,\rm sub}$. If the origin $O(p_t^{AA,\rm sub}, p_t^{pp})$ is dominated by a region of $p_t^{pp}$ of the same order as $p_t^{AA,\rm sub}$, then that tells us that the jets being reconstructed are truly hard. If, on the other hand, it is dominated by $p_t^{pp}$ near zero, then that is a sign that apparently hard heavy-ion jets are mostly due to upwards fluctuations of the background superimposed on low-$p_t$ pp jets, making the unfolding more delicate.

In figure 12 we show the origins of heavy-ion jets as determined in our HYDJET simulations.$^{20}$ The upper row provides the origin plots for anti-$k_t$ jets at RHIC. Each plot corresponds to one bin of $p_t^{AA,\rm sub}$, and shows $O(p_t^{AA,\rm sub}, p_t^{pp})$ as a function of $p_t^{pp}$. At moderate $p_t^{AA,\rm sub}$, the 25-30 GeV bin, the origin is dominated by low $p_t^{pp}$. This is perhaps not surprising, given the result in section 4.6 that the $p_t^{pp}$ origin is expected to be $\sim 17$ GeV lower than $p_t^{AA,\rm sub}$ for anti-$k_t$ jets; additionally, that result assumed an exponential spectrum for the inclusive jet distribution, whereas the distribution rises substantially faster towards low $p_t^{pp}$. As $p_t^{AA,\rm sub}$ increases one sees that the contribution of high-$p_t^{pp}$ jets increases, in a manner not too inconsistent with the expected $\sim 17$ GeV shift, though the $p_t^{pp}$ distribution remains rather broad and a peak persists at small $p_t^{pp}$. These plots suggest that an inclusive jet distribution measurement with the anti-$k_t$ algorithm at RHIC is not completely trivial since, up to rather large $p_t^{AA,\rm sub}$, one is still sensitive to the jet distribution at small values of $p_t^{pp}$, where the separation between "hard" jets and the soft medium is less clear. Nevertheless, two points should be kept in mind. Firstly, the upper row of fig. 12 shows that different $p_t^{AA,\rm sub}$ have complementary sensitivities to different parts of the $p_t^{pp}$ spectrum. Thus it should still be possible to "unfold" the $p_t^{AA,\rm sub}$ distribution to obtain information about the $p_t^{pp}$ distribution, unfolding being in any case a standard part of the experimental correction procedure. Secondly, STAR quotes [18] a 10% smaller value for $\sigma_{\Delta p_t}$ than the 7.5 GeV that we find in HYDJET. Such a reduction can make it noticeably easier to perform the unfolding.

19 Specifically, keeping in mind the HYDJET simulation, one can cluster each pp event separately to obtain a long list of pp jets from all the separate hard events.

20 A point to be aware of is that multiple pp jets can match a single heavy-ion jet, i.e. have at least half their $p_t$ contained in the heavy-ion jet. In evaluating $O(p_t^{AA,\rm sub}, p_t^{pp})$ we take only the highest-$p_t$ matched jet. If there is no matched jet (this occurs only rarely) then we fill the bin at $p_t^{pp} = 0$. Note also that since we are not explicitly embedding hard jets, all pp jets in the events have undergone HYDJET's quenching.

Figure 12: The $p_t$ distribution of the pp jet corresponding to a given bin of reconstructed heavy-ion jet $p_t^{AA,\rm sub}$ at RHIC, i.e. $O(p_t^{AA,\rm sub}, p_t^{pp})$ as a function of $p_t^{pp}$ for a given bin of $p_t^{AA,\rm sub}$. The upper row is for the anti-$k_t$ algorithm, while the lower row is for C/A(filt). Each column corresponds to a different $p_t^{AA,\rm sub}$ bin, as indicated by the vertical band in each plot. Cases in which the histogram is broad or peaked near 0 are indicative of the need for special care in the unfolding procedure. These plots were generated using approximately 90 million events. Each plot has been normalised to the number of events in the corresponding $p_t^{AA,\rm sub}$ bin.

Figure 13: Same as fig. 12 for LHC kinematics (PbPb, $\sqrt{s} = 5.5$ TeV), generated with approximately 16 million events. Note the use of a smaller rapidity range here, $|y| < 1$, compared to the earlier LHC plots.
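The matching rule used to fill the origin plots (footnote 20) can be sketched as follows. The sketch is illustrative rather than the code used for the figures: it assumes that particles carry integer identifiers shared between the pp and heavy-ion clusterings, and it approximates "at least half their $p_t$" by a scalar sum over constituent $p_t$. The function name `origin_pt` and the dictionary layout are hypothetical.

```python
def origin_pt(hi_jet_ids, pp_jets):
    """pt of the pp 'origin' of a heavy-ion jet.

    hi_jet_ids : set of particle ids contained in the heavy-ion jet
    pp_jets    : list of dicts {"pt": float, "constituents": {id: pt}}
                 obtained by clustering each pp event separately

    A pp jet is matched if at least half of its (scalar-summed) constituent
    pt lies inside the heavy-ion jet; the highest-pt matched jet is taken,
    and 0.0 is returned when nothing matches.
    """
    matched = []
    for jet in pp_jets:
        total = sum(jet["constituents"].values())
        shared = sum(pt for pid, pt in jet["constituents"].items() if pid in hi_jet_ids)
        if total > 0.0 and shared >= 0.5 * total:
            matched.append(jet["pt"])
    return max(matched, default=0.0)
```

Histogramming the returned value in bins of the subtracted heavy-ion jet $p_t$ then gives one origin plot per bin.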

The impact of a reduction in $\sigma_{\Delta p_t}$ is illustrated in the lower row of fig. 12, which shows the result for C/A(filt). Here, even the 25-30 GeV bin for $p_t^{AA,\rm sub}$ shows a moderate-$p_t$ peak in the distribution of $p_t^{pp}$, and in the 35-40 GeV bin the low-$p_t$ "fake" peak has disappeared almost entirely. Furthermore, the peak is centred about 7 GeV lower than the centre of the $p_t^{AA,\rm sub}$ bin, remarkably consistent with the calculations of section 4.6. Overall, therefore, unfolding with C/A(filt) will be easier than with anti-$k_t$.

Corresponding plots for the LHC are shown in fig. 13. While C/A(filt)'s lower dispersion still gives it an advantage over anti-$k_t$, for $p_t^{AA,\rm sub} > 80$ GeV, anti-$k_t$ does now reach a domain where the original pp jets are themselves always hard.

Procedures to reject fake jets have been proposed in [TQl CD]. They are based on a cut on (collinear unsafe) jet shape properties and it is thus unclear how they will be affected by quenching and, in particular, whether the expected benefit of cutting the low-$p_t^{pp}$ peak in figs. 12 and 13 outweighs the disadvantage of potentially introducing extra sources of systematic uncertainty at moderate $p_t$.

One final comment is that experimental unfolding should provide enough information to produce 
origin plots like those shown here. As part of the broader discussion about fakes it would probably 
be instructive for such plots to be shown together with the inclusive-jet results. 



5.2 Exclusive analyses 

An example of an exclusive analysis might be a dijet study, in which one selects the two hardest jets in the event, with transverse momenta $p_{t,1}$ and $p_{t,2}$, and plots the distribution of $\frac12 H_{T,2} = \frac12 (p_{t,1} + p_{t,2})$. Here one can define "fakes" as corresponding to cases where one or other of the jets fails to match to one of the two hardest among all the jets from the individual pp events. This definition is insensitive to the soft/hard boundary in a simulation such as HYDJET, because it naturally picks out hard pp jets that are far above that boundary. And, by concentrating on just two jets, it also evades the problem of high occupancy from the large multiplicity of semi-hard pp collisions. This simplification of the definition of fakes is common to many exclusive analyses, because they tend to share the feature of identifying just one or two hard reference jets.

The specific case of the exclusive dijet analysis has the added advantage that it is amenable to 
a data-driven estimation of fakes. One divides the events into two groups, those for which the two 
hardest jets are on the same side (in azimuth) of the event and those in which they are on opposite 
sides (a related analysis was presented by STAR in ref. [42]). For events in which one of the two 
jets is "fake," the two jets are just as likely to be on the same side as on the opposite side. This is 
not the case for non-fake jets, given that the two hardest "true" jets nearly always come from the 
same pp event and so have to be on opposite sides.$^{21}$ Thus by counting the number of same-side 
versus opposite-side dijets in a given $\frac12 H_{T,2}$ bin, one immediately has an estimate of the fake rate.$^{22}$ 
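
The counting argument above can be sketched numerically. This is a minimal illustration rather than the analysis used for the figures: it assumes that pairs containing at least one fake jet are split evenly between same-side and opposite-side, that true pairs are always opposite-side, and it neglects the phase-space asymmetry between the two sides. The function name and the example counts are hypothetical.

```python
def fake_fraction(n_same, n_opposite):
    """Estimate the fraction of dijet pairs that contain at least one fake jet.

    Assumption (illustrative only): fake pairs populate the same side and the
    opposite side equally, while true pairs are always opposite-side, so the
    number of fake pairs in a bin is about 2 * n_same.
    """
    total = n_same + n_opposite
    if total == 0:
        raise ValueError("empty bin")
    # Clip at 1 because statistical fluctuations can give n_same > n_opposite.
    return min(1.0, 2.0 * n_same / total)


# Hypothetical counts in two bins of (1/2) H_T2
print(fake_fraction(480, 520))  # peak region: dominated by fakes
print(fake_fraction(10, 190))   # tail region: dominated by true pairs
```

With limited acceptance in plain pp events, the second-jet cut discussed in the footnote would have to be applied before counting.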

This is illustrated in fig. 14, which shows the distribution at RHIC of the full, subtracted 

21 At RHIC energies, above $p_t \sim 10-15$ GeV, it is nearly always the case that the two hardest jets come from the 
same pp event. At the LHC, this happens above $20-30$ GeV. 

22 Note that for plain pp events, if one has only limited rapidity acceptance then the same-side/opposite-side 
separation is not infrared safe, because of events in which only one hard jet is within the acceptance and the other "jet" 
is given by a soft gluon emission. Thus to examine the same-side/opposite-side separation in plain pp events with 
limited acceptance, one would need to impose a $p_t$ cut on the second jet, say $p_{t,2} > \frac12 p_{t,1}$. 
Figure 15: Same as fig. 14 for LHC kinematics. 



$\frac12 H_{T,2}$ result, together with its separation into opposite-side and same-side components. One sees 
that in the peak region, the opposite and same-side distributions are very similar, indicating a 
predominantly "fake" origin for at least one of the two hardest jets (they are not quite identical, because 
there is less phase-space on the same side for a second jet than there is on the away side). However, 
above a certain full, reconstructed $\frac12 H_{T,2}$ value, about 30 GeV for anti-$k_t$ and 20 GeV for C/A(filt), 
the same-side distribution starts to fall far more rapidly than the opposite-side one, indicating that 
the measurement is now dominated by "true" pairs of jets. 

The LHC results, fig. 15, are qualitatively similar, with the same-side spectrum starting to fall 
off more steeply than the opposite-side one around 70 GeV for the anti-$k_t$ algorithm and 50 GeV 
for C/A(filt). 

One can also examine origin plots for $\frac12 H_{T,2}$, in analogy with the Monte Carlo analysis of 
section 5.1. For brevity, we refrain from showing them here, and restrict ourselves to the comment 
that in the region of $\frac12 H_{T,2}$ where the result is dominated by opposite-side pairs, the origin plots are 
consistent with a purely hard origin for the dijets. 
6 Conclusions 



In this article we have presented the results of a systematic study of heavy-ion jet reconstruction 
with jet-area-based background subtraction, building on the brief initial proposal of [14]. 

The questions we have examined include those of the choice of range for estimating the back- 
ground, the choice of algorithm for the jet finding and the robustness of the reconstruction with 
respect to quenching effects and collision centrality. 

We have found that there is little difference between various ranges, as long as they are chosen 
to be localised to the vicinity of the jet of interest and of sufficient size (at least 4 units of area 
for jets with R = 0.4). In comparing different algorithms we examined the systematic offset and 
the dispersion in the reconstructed jet $p_t$. The offset can be brought close to zero by using the 
anti-$k_t$ algorithm, while the $k_t$ algorithm has the largest offset; the Cambridge/Aachen (C/A) 
algorithm with filtering also gives a small offset, though this seems to be due to a fortuitous 
cancellation between two only partially related effects. The dispersion is comparable for anti-$k_t$, 
C/A and $k_t$, but significantly smaller for C/A(filt) (except at high transverse momenta for LHC), 
as a consequence of its smaller jet area. Among the different algorithms, anti-$k_t$ is the most robust 
with respect to quenching effects, and C/A(filt) seems reasonably robust at RHIC, though a little 
less so at the LHC. The precise numerical results for offset and dispersion can depend a little on 
the details of the simulation and the analysis; however, the general pattern remains. 

Overall our results indicate that the area-based subtraction method seems well suited for jet 
reconstruction in heavy-ion collisions. Two jet-algorithm choices were found to perform particularly 
well: anti-$k_t$, which has small offsets but larger fluctuations, and C/A with filtering, for which the 
offsets may be harder to control, but for which the fluctuations are significantly reduced, with 
consequent advantages for the unfolding of experimentally measured jet spectra. Ultimately, we 
suspect that carrying out parallel analyses with these two choices may help maximise the reliability 
of jet results in HI collisions. 
A Estimate of the minimal size of a range 
A.1 Fluctuations in extracted $\rho$ 

In section 3.3 we gave an estimate of the minimum size of a range one should require for determining 
$\rho$, given a requirement that fluctuations in the determination of $\rho$ should be moderate. We give the 
details of the computation in this appendix. 
We start from the fact that the error made on the estimation of the background density $\rho$ 
will translate into an increase of the dispersion $\sigma_{\Delta p_t}$: on one hand, the dominant contribution 
to $\sigma_{\Delta p_t}$ comes from the intra-event fluctuations of the background, i.e. is of order $\sigma\sqrt{A_{\rm jet}}$ with 
$A_{\rm jet}$ the jet area; on the other hand, the dispersion $S_{\Delta\rho}$ of the misestimation of $\rho$ leads to an 
additional dispersion on the reconstructed jet $p_t$ of $S_{\Delta\rho} A_{\rm jet}$. Adding these two sources of dispersion 
in quadrature and using the result from section 3.4 of [16], $S_{\Delta\rho} \simeq \sqrt{\pi/(2A_{\mathcal{R}})}\,\sigma$ with $A_{\mathcal{R}}$ the area 
of the range under consideration, we get 

$$\sigma_{\Delta p_t} \simeq \sigma\sqrt{A_{\rm jet}}\,\sqrt{1 + \frac{\pi A_{\rm jet}}{2A_{\mathcal{R}}}}\,. \tag{12}$$

If we ask e.g. that the contribution to the total dispersion coming from the misestimation of the 
background be no more than a fraction $\epsilon$ of the total $\sigma_{\Delta p_t}$, then we obtain the requirement 

$$A_{\mathcal{R}} > A_{\min} \simeq \frac{\pi A_{\rm jet}}{4\epsilon}\,. \tag{13}$$

For anti-$k_t$ jets of radius $R$, with $A_{\rm jet} \simeq \pi R^2$, this translates to 

$$A_{\min} \simeq \frac{\pi^2 R^2}{4\epsilon} \simeq 25 R^2\,, \tag{14}$$

where the numerical result has been given for $\epsilon = 0.1$. For $R = 0.4$, it becomes $A_{\mathcal{R}} > 4$. 

We can also cast this result in terms of the number of jets that must be present in $\mathcal{R}$. Assuming 
that the jets used to estimate $\rho$ have a mean area of $0.55\pi R_\rho^2$,$^{23}$ we find a minimal number of jets, 

$$n_{\min} \simeq \frac{A_{\min}}{0.55\pi R_\rho^2} \simeq \frac{1.4}{\epsilon}\,\frac{R^2}{R_\rho^2}\,, \tag{15}$$

where, as before, we have taken $A_{\rm jet} \simeq \pi R^2$. Taking the numbers quoted above and $R_\rho = 0.5$, as 
used in the main body of the article, this gives $n_{\min} \simeq 9$. 
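
The numbers quoted in this appendix can be checked with a few lines of code. The sketch below simply evaluates eqs. (12), (13) and (15) under the same assumptions as the text (jet area $\pi R^2$, mean background-jet area $0.55\pi R_\rho^2$); the function names are ours and purely illustrative.

```python
import math


def min_range_area(jet_area, eps):
    """Eq. (13): minimal range area, A_min ~ pi * A_jet / (4 * eps)."""
    return math.pi * jet_area / (4.0 * eps)


def relative_increase(jet_area, range_area):
    """Relative growth of the dispersion from misestimating rho, from eq. (12)."""
    return math.sqrt(1.0 + math.pi * jet_area / (2.0 * range_area)) - 1.0


def min_jets_in_range(R, R_rho, eps, mean_area_coeff=0.55):
    """Eq. (15): A_min divided by the mean area of the jets used to estimate rho."""
    a_min = min_range_area(math.pi * R**2, eps)
    return a_min / (mean_area_coeff * math.pi * R_rho**2)


R, R_rho, eps = 0.4, 0.5, 0.1
a_jet = math.pi * R**2
a_min = min_range_area(a_jet, eps)
print(f"A_min = {a_min:.2f}")                                   # about 4
print(f"n_min = {min_jets_in_range(R, R_rho, eps):.1f}")        # about 9
print(f"relative increase at A_min = {relative_increase(a_jet, a_min):.3f}")  # just below eps
```

The last line confirms that, at $A_{\mathcal{R}} = A_{\min}$, the extra dispersion is close to the requested fraction $\epsilon$ of the total.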

A.2 Hard-jet bias in extracted $\rho$ 

From section 3.3 of [16], we know that the presence of hard jets and initial-state radiation leads to 
a bias in the extraction of $\rho$ of 

$$\langle\Delta\rho\rangle \simeq \sigma R_\rho \sqrt{\frac{\pi c_J}{2}}\,\frac{\langle n_h\rangle}{A_{\mathcal{R}}}\,, \tag{16}$$

where $c_J \simeq 2$ is a numerical constant and $\langle n_h\rangle$ is the average number of "hard" jets (those above 
the scale of the background fluctuations, including initial-state radiation); $\langle n_h\rangle$ is given by 



(jih)_^J^ + C i ^_ L = ln MV£>M (17) 

An An vr 2 b ' ' a s (p t ) 

where $n_b$ is the number of "Born" partons from the underlying $2 \to 2$ scattering that enter the 
region $\mathcal{R}$, while $b_0 = (11C_A - 2n_f)/(12\pi)$ is the first coefficient of the QCD $\beta$-function. In the 
context of [16], principally directed towards a study of the UE in pp collisions, $\sigma$ was rather small, 



23 This is the typical area one would obtain using a (strongly recommended) jet definition like the $k_t$ or C/A 
algorithms for the background estimation [15]. 
causing the second term of eq. (17), associated with initial-state radiation above a scale $\sim \sigma R_\rho$, to 
be comparable in size to the first term. In our case, $\sigma$ is significantly larger and this reduces the 
impact of the second term sufficiently that we can ignore it. We thus arrive at the result 

$$\langle\Delta\rho\rangle \simeq \sigma R_\rho \sqrt{\frac{\pi c_J}{2}}\,\frac{n_b}{A_{\mathcal{R}}}\,.$$

Note that the presence of hard jets and initial-state radiation also affects the fluctuations in the 
misestimation of $\rho$ and this should in principle have been included in the estimates of appendix A.1. 
However, while the effect is not completely negligible, to within the accuracy that is relevant for us 
(a few tens of percent in the estimation of a minimal $A_{\mathcal{R}}$) it does not significantly alter the picture 
outlined there. 
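
To get a feel for the size of this bias, one can evaluate eq. (16) with the initial-state-radiation term dropped, i.e. with $\langle n_h\rangle$ replaced by $n_b$. The sketch below does this for purely illustrative inputs: the value of $\sigma$ and the choice $n_b = 1$ are placeholders rather than numbers taken from the text, and the function name is ours.

```python
import math


def rho_bias(sigma, r_rho, n_hard, range_area, c_j=2.0):
    """Eq. (16): <Delta rho> ~ sigma * R_rho * sqrt(pi * c_J / 2) * <n_h> / A_R."""
    return sigma * r_rho * math.sqrt(math.pi * c_j / 2.0) * n_hard / range_area


# Placeholder inputs: sigma = 10 GeV per sqrt(unit area), one Born parton in a range of area 4.
bias = rho_bias(sigma=10.0, r_rho=0.5, n_hard=1.0, range_area=4.0)
jet_shift = bias * math.pi * 0.4**2  # magnitude of the induced shift for a jet of area pi * R^2, R = 0.4
print(f"rho bias  = {bias:.2f} GeV per unit area")
print(f"jet shift = {jet_shift:.2f} GeV")
```

Because the bias scales as $1/A_{\mathcal{R}}$, doubling the range area halves it, which is the trade-off against the locality of the range.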



B Subtraction bias due to filtering 

We have seen from fig. 6 in section 4.3 that the subtraction differs when we use the C/A algorithm 
with and without filtering. Since this difference is not due to back-reaction (see fig. E]), it has to be 
due to the subtraction itself. 

The difference comes from a bias introduced by the selection of the two hardest subjets during 
filtering. The dominant contribution comes when only one subjet, which we shall assume to be harder 
than all the others, contains the hard radiation, all the other subjets being pure background. In 
that case, the selection of the hardest of these pure-background subjets as the second subjet to be 
kept tends to pick positive fluctuations of the background. This in turn results in a positive offset 
compared to pure C/A clustering, as observed in section 4.3. 

To compute the effect analytically, let us thus assume that we have one hard subjet and $n_{\rm bkg}$ 
pure-background subjets of area $A_g = 0.55\pi R_{\rm filt}^2$ [15]. After subtraction, the momentum of each of 
the pure-background subjets can be approximated as having a Gaussian distribution of average zero 
and dispersion $\sigma\sqrt{A_g}$. Assuming that the "hard" subjet's transverse momentum remains larger 
than that of all the background jets, the two subjets that will be kept by the filter are the hard subjet 
(subtracted) and the hardest of all the subtracted background jets. The momentum distribution 
of the latter is given by the maximum of the $n_{\rm bkg}$ Gaussian distributions.$^{24}$ We are only interested 
here in computing the average bias introduced by the filtering procedure, which is then given by 

$$\langle(\Delta p_t)_{\rm filt}\rangle \simeq \int \prod_{k=1}^{n_{\rm bkg}} \left( \frac{dp_{t,k}}{\sqrt{2\pi A_g}\,\sigma}\, e^{-p_{t,k}^2/(2A_g\sigma^2)} \right) \max(p_{t,1}, \ldots, p_{t,n_{\rm bkg}})\,.$$



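For the typical case $R_{\rm filt} = R/2$ and $n_{\rm bkg} = 3$, one finds 

$$\langle(\Delta p_t)_{\rm filt}\rangle \simeq \frac{3}{2\sqrt{\pi}}\,\sigma\sqrt{A_g} \simeq 0.56\,R\,\sigma\,. \tag{19}$$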

If we insert in that expression the typical values for the fluctuations quoted in sectionHJand R = 0.4, 
we find average biases of 2 GeV for RHIC and 4.5 GeV at the LHC, which are in good agreement 
with the differences observed between C/A with and without filtering in fig. 6. 
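
The coefficient 0.56 can be checked independently. The sketch below evaluates the mean of the maximum of $n$ standard Gaussian variables by direct numerical integration and then multiplies by $\sqrt{A_g}/R$ for $R_{\rm filt} = R/2$. It ignores the zero-replacement subtlety of footnote 24, as the text does; the function names and integration settings are our own choices.

```python
import math


def expected_max_gaussians(n, steps=20000, xmax=10.0):
    """Mean of the maximum of n independent standard Gaussian variables.

    Uses E[max] = integral of x * n * phi(x) * Phi(x)**(n - 1) dx,
    evaluated with the trapezoidal rule on [-xmax, xmax].
    """
    def phi(x):  # standard normal density
        return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

    def Phi(x):  # standard normal cumulative distribution
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    h = 2.0 * xmax / steps
    total = 0.0
    for i in range(steps + 1):
        x = -xmax + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * n * phi(x) * Phi(x) ** (n - 1)
    return total * h


def filter_bias_coefficient(n_bkg=3, rfilt_over_r=0.5, area_coeff=0.55):
    """Coefficient c such that <(Delta pt)_filt> ~ c * R * sigma."""
    sqrt_area_over_r = math.sqrt(area_coeff * math.pi) * rfilt_over_r
    return expected_max_gaussians(n_bkg) * sqrt_area_over_r


print(f"E[max of 3] = {expected_max_gaussians(3):.4f}")  # 3 / (2 sqrt(pi)), about 0.8463
print(f"coefficient = {filter_bias_coefficient():.2f}")  # about 0.56
```

Changing `n_bkg` shows how the bias would grow if more pure-background subjets competed for the second slot.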



24 Within FastJet's filtering tools, when the subtracted transverse momentum of a subjet is negative, the subjet is 
assumed to be pure noise and so discarded. This means that the momentum distribution of the hardest subtracted 
background jet is really given by the distribution of the maximum of the n Gaussian-distributed random numbers, 
but with the result replaced by zero if all of them are negative. In the calculations here we ignore this subtlety, since 
we will have n = 3 and only 1/8th of the time are three Gaussian-distributed random numbers all negative. 
Figure 16: The decomposition of the dispersion into back-reaction and "background" components 
(including misestimation of $\rho$). The left-hand plot is for the anti-$k_t$ algorithm and the right-hand 
one for C/A(filt). Both correspond to LHC collisions at $\sqrt{s_{NN}} = 5.5$ TeV. 

C Contributions to dispersion 

C.1 Back reaction versus background fluctuations 

We stated, in section 4.3.2, that the increase of the dispersion at high $p_t$ seen in fig. 8 was mainly 
due to back-reaction. This is made explicit in fig. 16, which decomposes the dispersion into its two 
components: that associated with the back-reaction, $\sigma_{\Delta p_t}^{\rm BR}$, and that associated with background 
fluctuations and misestimation of $\rho$ (defined as $[\sigma_{\Delta p_t}^2 - (\sigma_{\Delta p_t}^{\rm BR})^2]^{1/2}$). One sees that the 
background-fluctuation component is essentially independent of $p_t$, while the back-reaction dispersion has a 
noticeable $p_t$ dependence. This is the case because the back-reaction dispersion is dominated by 
rare events in which two similarly hard subjets are separated by a distance close to $R$ (specifically 
by $R + \epsilon$ with $\epsilon \ll 1$). In such a configuration, the background's contribution to the two subjets 
can affect whether they recombine and so lead to a large, $\mathcal{O}(p_t)$, change to the jet's momentum. 
In the limit of a uniform background, $\sigma/\rho \ll 1$, this can be shown to occur with a probability of 
order $\alpha_s \rho R^2/p_t$. Thus the contribution to the average shift $\langle\Delta p_t\rangle$ is proportional to $\alpha_s \rho R^2$ (which 
in a full analysis is found to be enhanced by a logarithm [15] for the $k_t$ and C/A algorithms), 
while the contribution to $\langle\Delta p_t^2\rangle$ goes as $\alpha_s \rho R^2 p_t$, and so leads to a dispersion that should grow 
asymptotically as $\sqrt{p_t}$ (fig. 16 is, however, probably not yet in the asymptotic regime). 
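
The scaling argument of this paragraph can be written out schematically. The estimate below keeps only the parametric dependence and the magnitudes of the shifts; the symbol $P_{\rm BR}$ for the probability of a large back-reaction is introduced here for illustration and is not notation from the main text.

```latex
\begin{align}
  P_{\rm BR} &\sim \alpha_s\,\frac{\rho R^2}{p_t}\,, \\
  |\langle \Delta p_t \rangle| &\sim p_t\, P_{\rm BR} \sim \alpha_s\,\rho R^2\,, \\
  \langle \Delta p_t^2 \rangle &\sim p_t^2\, P_{\rm BR} \sim \alpha_s\,\rho R^2\, p_t\,, \\
  \sqrt{\langle \Delta p_t^2\rangle - \langle \Delta p_t\rangle^2}
     &\sim \sqrt{\alpha_s\,\rho R^2\, p_t}
     \qquad \text{for } p_t \gg \alpha_s\,\rho R^2\,.
\end{align}
```

The mean shift is therefore independent of $p_t$ at this level of approximation, while the dispersion grows as the square root of $p_t$ once the second moment dominates over the square of the mean.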

It is worth keeping in mind that even though rare but large back-reaction dominates the overall 
dispersion, it will probably not be the main contributor in distorting the reconstructed jet spectrum. 
Such distortions come from upwards $\Delta p_t$ fluctuations, whereas large back-reaction tends to be 
dominated by downwards fluctuations. The reason is simple: in order to have an upwards fluctuation 
from back-reaction, there must be extra $p_t$ near the jet in the original pp event. This implies the 
presence of a harder underlying $2 \to 2$ scattering than would be deduced from the jet $p_t$, with a 
corresponding significant price to pay in terms of more suppressed matrix elements and PDFs. 

Figure 16 also shows that the non-back-reaction component is nearly independent of $p_t$. This is 
expected since the anomalous dimension of the jet area is zero for anti-$k_t$ and small for C/A (with 
or without filtering), and in any case leads to a weak scaling with $p_t$, as $\ln\ln p_t$. Furthermore, there 
is roughly a factor of $1/\sqrt{2}$ between anti-$k_t$ and C/A(filt), as expected based on a proportionality 
of the dispersion to the square-root of the jet area. 
Figure 17: Dispersion, $\sigma_{\Delta p_t}$, for RHIC, as a function of the jet $p_t$, with two different choices for the 
ghost area, 0.01 and 0.0025. 

C.2 Quality of area determination 

One further source of $\Delta p_t$ fluctuations can come from imperfect estimation of the area of the 
jets. We recall that throughout this article we have used soft ghosts, each with area of 0.01, in 
order to establish the jet area. That implies a corresponding finite resolution on the jet area and 
related poor estimation of the exact edges of the jets, which can have an impact on the amount 
of background that one subtracts from each jet, and, consequently, on the final dispersion. It is 
therefore interesting to see, in figure 17, that the dispersion $\sigma_{\Delta p_t}$ is reduced by about 0.2-0.4 GeV if 
one lowers the ghost area to 0.0025. 

While this doesn't affect any of the conclusions of our paper, it does suggest that for a full 
experimental analysis there are benefits to be had from using a ghost area that is smaller than the 
default FastJet setting of 0.01. 